summaryrefslogtreecommitdiff
path: root/fs/ksmbd/vfs.c
AgeCommit message (Collapse)AuthorFilesLines
2022-11-25vfs: fix copy_file_range() averts filesystem freeze protectionAmir Goldstein1-3/+3
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range(). To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more. Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already. Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path. This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions. Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-10-07Merge tag '6.1-rc-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds1-12/+24
Pull ksmbd updates from Steve French: - RDMA (smbdirect) fixes - fixes for SMB3.1.1 POSIX Extensions (especially for id mapping) - various casemapping fixes for mount and lookup - UID mapping fixes - fix confusing error message - protocol negotiation fixes, including NTLMSSP fix - two encryption fixes - directory listing fix - some cleanup fixes * tag '6.1-rc-ksmbd-fixes' of git://git.samba.org/ksmbd: (24 commits) ksmbd: validate share name from share config response ksmbd: call ib_drain_qp when disconnected ksmbd: make utf-8 file name comparison work in __caseless_lookup() ksmbd: Fix user namespace mapping ksmbd: hide socket error message when ipv6 config is disable ksmbd: reduce server smbdirect max send/receive segment sizes ksmbd: decrease the number of SMB3 smbdirect server SGEs ksmbd: Fix wrong return value and message length check in smb2_ioctl() ksmbd: set NTLMSSP_NEGOTIATE_SEAL flag to challenge blob ksmbd: fix encryption failure issue for session logoff response ksmbd: fix endless loop when encryption for response fails ksmbd: fill sids in SMB_FIND_FILE_POSIX_INFO response ksmbd: set file permission mode to match Samba server posix extension behavior ksmbd: change security id to the one samba used for posix extension ksmbd: update documentation ksmbd: casefold utf-8 share names and fix ascii lowercase conversion ksmbd: port to vfs{g,u}id_t and associated helpers ksmbd: fix incorrect handling of iterate_dir MAINTAINERS: remove Hyunchul Lee from ksmbd maintainers MAINTAINERS: Add Tom Talpey as ksmbd reviewer ...
2022-10-05ksmbd: make utf-8 file name comparison work in __caseless_lookup()Atte Heikkilä1-3/+17
Case-insensitive file name lookups with __caseless_lookup() use strncasecmp() for file name comparison. strncasecmp() assumes an ISO8859-1-compatible encoding, which is not the case here as UTF-8 is always used. As such, use of strncasecmp() here produces correct results only if both strings use characters in the ASCII range only. Fix this by using utf8_strncasecmp() if CONFIG_UNICODE is set. On failure or if CONFIG_UNICODE is not set, fallback to strncasecmp(). Also, as we are adding an include for `linux/unicode.h', include it in `fs/ksmbd/connection.h' as well since it should be explicit there. Signed-off-by: Atte Heikkilä <atteh.mailbox@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-05ksmbd: constify struct pathAl Viro1-2/+2
... in particular, there should never be a non-const pointers to any file->f_path. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-05ksmbd: don't open-code %pDAl Viro1-8/+6
a bunch of places used %pd with file->f_path.dentry; shorter (and saner) way to spell that is %pD with file... Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-08-18Change calling conventions for filldir_tAl Viro1-8/+6
filldir_t instances (directory iterators callbacks) used to return 0 for "OK, keep going" or -E... for "stop". Note that it's *NOT* how the error values are reported - the rules for those are callback-dependent and ->iterate{,_shared}() instances only care about zero vs. non-zero (look at emit_dir() and friends). So let's just return bool ("should we keep going?") - it's less confusing that way. The choice between "true means keep going" and "true means stop" is bikesheddable; we have two groups of callbacks - do something for everything in directory, until we run into problem and find an entry in directory and do something to it. The former tended to use 0/-E... conventions - -E<something> on failure. The latter tended to use 0/1, 1 being "stop, we are done". The callers treated anything non-zero as "stop", ignoring which non-zero value did they get. "true means stop" would be more natural for the second group; "true means keep going" - for the first one. I tried both variants and the things like if allocation failed something = -ENOMEM; return true; just looked unnatural and asking for trouble. [folded suggestion from Matthew Wilcox <willy@infradead.org>] Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-09Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds1-2/+6
Pull ksmbd updates from Steve French: - fixes for memory access bugs (out of bounds access, oops, leak) - multichannel fixes - session disconnect performance improvement, and session register improvement - cleanup * tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix heap-based overflow in set_ntacl_dacl() ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT ksmbd: prevent out of bound read for SMB2_WRITE ksmbd: fix use-after-free bug in smb2_tree_disconect ksmbd: fix memory leak in smb2_handle_negotiate ksmbd: fix racy issue while destroying session on multichannel ksmbd: use wait_event instead of schedule_timeout() ksmbd: fix kernel oops from idr_remove() ksmbd: add channel rwlock ksmbd: replace sessions list in connection with xarray MAINTAINERS: ksmbd: add entry for documentation ksmbd: remove unused ksmbd_share_configs_cleanup function
2022-08-04ksmbd: fix heap-based overflow in set_ntacl_dacl()Namjae Jeon1-0/+5
The testcase use SMB2_SET_INFO_HE command to set a malformed file attribute under the label `security.NTACL`. SMB2_QUERY_INFO_HE command in testcase trigger the following overflow. [ 4712.003781] ================================================================== [ 4712.003790] BUG: KASAN: slab-out-of-bounds in build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.003807] Write of size 1060 at addr ffff88801e34c068 by task kworker/0:0/4190 [ 4712.003813] CPU: 0 PID: 4190 Comm: kworker/0:0 Not tainted 5.19.0-rc5 #1 [ 4712.003850] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 4712.003867] Call Trace: [ 4712.003870] <TASK> [ 4712.003873] dump_stack_lvl+0x49/0x5f [ 4712.003935] print_report.cold+0x5e/0x5cf [ 4712.003972] ? ksmbd_vfs_get_sd_xattr+0x16d/0x500 [ksmbd] [ 4712.003984] ? cmp_map_id+0x200/0x200 [ 4712.003988] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004000] kasan_report+0xaa/0x120 [ 4712.004045] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004056] kasan_check_range+0x100/0x1e0 [ 4712.004060] memcpy+0x3c/0x60 [ 4712.004064] build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004076] ? parse_sec_desc+0x580/0x580 [ksmbd] [ 4712.004088] ? ksmbd_acls_fattr+0x281/0x410 [ksmbd] [ 4712.004099] smb2_query_info+0xa8f/0x6110 [ksmbd] [ 4712.004111] ? psi_group_change+0x856/0xd70 [ 4712.004148] ? update_load_avg+0x1c3/0x1af0 [ 4712.004152] ? asym_cpu_capacity_scan+0x5d0/0x5d0 [ 4712.004157] ? xas_load+0x23/0x300 [ 4712.004162] ? smb2_query_dir+0x1530/0x1530 [ksmbd] [ 4712.004173] ? _raw_spin_lock_bh+0xe0/0xe0 [ 4712.004179] handle_ksmbd_work+0x30e/0x1020 [ksmbd] [ 4712.004192] process_one_work+0x778/0x11c0 [ 4712.004227] ? _raw_spin_lock_irq+0x8e/0xe0 [ 4712.004231] worker_thread+0x544/0x1180 [ 4712.004234] ? __cpuidle_text_end+0x4/0x4 [ 4712.004239] kthread+0x282/0x320 [ 4712.004243] ? process_one_work+0x11c0/0x11c0 [ 4712.004246] ? kthread_complete_and_exit+0x30/0x30 [ 4712.004282] ret_from_fork+0x1f/0x30 This patch add the buffer validation for security descriptor that is stored by malformed SMB2_SET_INFO_HE command. and allocate large response buffer about SMB2_O_INFO_SECURITY file info class. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17771 Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-08-01ksmbd: fix racy issue while destroying session on multichannelNamjae Jeon1-2/+1
After multi-channel connection with windows, Several channels of session are connected. Among them, if there is a problem in one channel, Windows connects again after disconnecting the channel. In this process, the session is released and a kernel oop can occurs while processing requests to other channels. When the channel is disconnected, if other channels still exist in the session after deleting the channel from the channel list in the session, the session should not be released. Finally, the session will be released after all channels are disconnected. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-07-15acl: move idmapped mount fixup into vfs_{g,s}etxattr()Christian Brauner1-1/+1
This cycle we added support for mounting overlayfs on top of idmapped mounts. Recently I've started looking into potential corner cases when trying to add additional tests and I noticed that reporting for POSIX ACLs is currently wrong when using idmapped layers with overlayfs mounted on top of it. I'm going to give a rather detailed explanation to both the origin of the problem and the solution. Let's assume the user creates the following directory layout and they have a rootfs /var/lib/lxc/c1/rootfs. The files in this rootfs are owned as you would expect files on your host system to be owned. For example, ~/.bashrc for your regular user would be owned by 1000:1000 and /root/.bashrc would be owned by 0:0. IOW, this is just regular boring filesystem tree on an ext4 or xfs filesystem. The user chooses to set POSIX ACLs using the setfacl binary granting the user with uid 4 read, write, and execute permissions for their .bashrc file: setfacl -m u:4:rwx /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc Now they to expose the whole rootfs to a container using an idmapped mount. So they first create: mkdir -pv /vol/contpool/{ctrover,merge,lowermap,overmap} mkdir -pv /vol/contpool/ctrover/{over,work} chown 10000000:10000000 /vol/contpool/ctrover/{over,work} The user now creates an idmapped mount for the rootfs: mount-idmapped/mount-idmapped --map-mount=b:0:10000000:65536 \ /var/lib/lxc/c2/rootfs \ /vol/contpool/lowermap This for example makes it so that /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc which is owned by uid and gid 1000 as being owned by uid and gid 10001000 at /vol/contpool/lowermap/home/ubuntu/.bashrc. Assume the user wants to expose these idmapped mounts through an overlayfs mount to a container. mount -t overlay overlay \ -o lowerdir=/vol/contpool/lowermap, \ upperdir=/vol/contpool/overmap/over, \ workdir=/vol/contpool/overmap/work \ /vol/contpool/merge The user can do this in two ways: (1) Mount overlayfs in the initial user namespace and expose it to the container. (2) Mount overlayfs on top of the idmapped mounts inside of the container's user namespace. Let's assume the user chooses the (1) option and mounts overlayfs on the host and then changes into a container which uses the idmapping 0:10000000:65536 which is the same used for the two idmapped mounts. Now the user tries to retrieve the POSIX ACLs using the getfacl command getfacl -n /vol/contpool/lowermap/home/ubuntu/.bashrc and to their surprise they see: # file: vol/contpool/merge/home/ubuntu/.bashrc # owner: 1000 # group: 1000 user::rw- user:4294967295:rwx group::r-- mask::rwx other::r-- indicating the the uid wasn't correctly translated according to the idmapped mount. The problem is how we currently translate POSIX ACLs. Let's inspect the callchain in this example: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */ sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() |> vfs_getxattr() | -> __vfs_getxattr() | -> handler->get == ovl_posix_acl_xattr_get() | -> ovl_xattr_get() | -> vfs_getxattr() | -> __vfs_getxattr() | -> handler->get() /* lower filesystem callback */ |> posix_acl_fix_xattr_to_user() { 4 = make_kuid(&init_user_ns, 4); 4 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 4); /* FAILURE */ -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4); } If the user chooses to use option (2) and mounts overlayfs on top of idmapped mounts inside the container things don't look that much better: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:10000000:65536 sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() |> vfs_getxattr() | -> __vfs_getxattr() | -> handler->get == ovl_posix_acl_xattr_get() | -> ovl_xattr_get() | -> vfs_getxattr() | -> __vfs_getxattr() | -> handler->get() /* lower filesystem callback */ |> posix_acl_fix_xattr_to_user() { 4 = make_kuid(&init_user_ns, 4); 4 = mapped_kuid_fs(&init_user_ns, 4); /* FAILURE */ -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4); } As is easily seen the problem arises because the idmapping of the lower mount isn't taken into account as all of this happens in do_gexattr(). But do_getxattr() is always called on an overlayfs mount and inode and thus cannot possible take the idmapping of the lower layers into account. This problem is similar for fscaps but there the translation happens as part of vfs_getxattr() already. Let's walk through an fscaps overlayfs callchain: setcap 'cap_net_raw+ep' /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc The expected outcome here is that we'll receive the cap_net_raw capability as we are able to map the uid associated with the fscap to 0 within our container. IOW, we want to see 0 as the result of the idmapping translations. If the user chooses option (1) we get the following callchain for fscaps: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */ sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() -> vfs_getxattr() -> xattr_getsecurity() -> security_inode_getsecurity() ________________________________ -> cap_inode_getsecurity() | | { V | 10000000 = make_kuid(0:0:4k /* overlayfs idmapping */, 10000000); | 10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000); | /* Expected result is 0 and thus that we own the fscap. */ | 0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000); | } | -> vfs_getxattr_alloc() | -> handler->get == ovl_other_xattr_get() | -> vfs_getxattr() | -> xattr_getsecurity() | -> security_inode_getsecurity() | -> cap_inode_getsecurity() | { | 0 = make_kuid(0:0:4k /* lower s_user_ns */, 0); | 10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); | 10000000 = from_kuid(0:0:4k /* overlayfs idmapping */, 10000000); | |____________________________________________________________________| } -> vfs_getxattr_alloc() -> handler->get == /* lower filesystem callback */ And if the user chooses option (2) we get: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:10000000:65536 sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() -> vfs_getxattr() -> xattr_getsecurity() -> security_inode_getsecurity() _______________________________ -> cap_inode_getsecurity() | | { V | 10000000 = make_kuid(0:10000000:65536 /* overlayfs idmapping */, 0); | 10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000); | /* Expected result is 0 and thus that we own the fscap. */ | 0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000); | } | -> vfs_getxattr_alloc() | -> handler->get == ovl_other_xattr_get() | |-> vfs_getxattr() | -> xattr_getsecurity() | -> security_inode_getsecurity() | -> cap_inode_getsecurity() | { | 0 = make_kuid(0:0:4k /* lower s_user_ns */, 0); | 10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); | 0 = from_kuid(0:10000000:65536 /* overlayfs idmapping */, 10000000); | |____________________________________________________________________| } -> vfs_getxattr_alloc() -> handler->get == /* lower filesystem callback */ We can see how the translation happens correctly in those cases as the conversion happens within the vfs_getxattr() helper. For POSIX ACLs we need to do something similar. However, in contrast to fscaps we cannot apply the fix directly to the kernel internal posix acl data structure as this would alter the cached values and would also require a rework of how we currently deal with POSIX ACLs in general which almost never take the filesystem idmapping into account (the noteable exception being FUSE but even there the implementation is special) and instead retrieve the raw values based on the initial idmapping. The correct values are then generated right before returning to userspace. The fix for this is to move taking the mount's idmapping into account directly in vfs_getxattr() instead of having it be part of posix_acl_fix_xattr_to_user(). To this end we split out two small and unexported helpers posix_acl_getxattr_idmapped_mnt() and posix_acl_setxattr_idmapped_mnt(). The former to be called in vfs_getxattr() and the latter to be called in vfs_setxattr(). Let's go back to the original example. Assume the user chose option (1) and mounted overlayfs on top of idmapped mounts on the host: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */ sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() |> vfs_getxattr() | |> __vfs_getxattr() | | -> handler->get == ovl_posix_acl_xattr_get() | | -> ovl_xattr_get() | | -> vfs_getxattr() | | |> __vfs_getxattr() | | | -> handler->get() /* lower filesystem callback */ | | |> posix_acl_getxattr_idmapped_mnt() | | { | | 4 = make_kuid(&init_user_ns, 4); | | 10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4); | | 10000004 = from_kuid(&init_user_ns, 10000004); | | |_______________________ | | } | | | | | |> posix_acl_getxattr_idmapped_mnt() | | { | | V | 10000004 = make_kuid(&init_user_ns, 10000004); | 10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004); | 10000004 = from_kuid(&init_user_ns, 10000004); | } |_________________________________________________ | | | | |> posix_acl_fix_xattr_to_user() | { V 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004); /* SUCCESS */ 4 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000004); } And similarly if the user chooses option (1) and mounted overayfs on top of idmapped mounts inside the container: idmapped mount /vol/contpool/merge: 0:10000000:65536 caller's idmapping: 0:10000000:65536 overlayfs idmapping (ofs->creator_cred): 0:10000000:65536 sys_getxattr() -> path_getxattr() -> getxattr() -> do_getxattr() |> vfs_getxattr() | |> __vfs_getxattr() | | -> handler->get == ovl_posix_acl_xattr_get() | | -> ovl_xattr_get() | | -> vfs_getxattr() | | |> __vfs_getxattr() | | | -> handler->get() /* lower filesystem callback */ | | |> posix_acl_getxattr_idmapped_mnt() | | { | | 4 = make_kuid(&init_user_ns, 4); | | 10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4); | | 10000004 = from_kuid(&init_user_ns, 10000004); | | |_______________________ | | } | | | | | |> posix_acl_getxattr_idmapped_mnt() | | { V | 10000004 = make_kuid(&init_user_ns, 10000004); | 10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004); | 10000004 = from_kuid(0(&init_user_ns, 10000004); | |_________________________________________________ | } | | | |> posix_acl_fix_xattr_to_user() | { V 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004); /* SUCCESS */ 4 = from_kuid(0:10000000:65536 /* caller's idmappings */, 10000004); } The last remaining problem we need to fix here is ovl_get_acl(). During ovl_permission() overlayfs will call: ovl_permission() -> generic_permission() -> acl_permission_check() -> check_acl() -> get_acl() -> inode->i_op->get_acl() == ovl_get_acl() > get_acl() /* on the underlying filesystem) ->inode->i_op->get_acl() == /*lower filesystem callback */ -> posix_acl_permission() passing through the get_acl request to the underlying filesystem. This will retrieve the acls stored in the lower filesystem without taking the idmapping of the underlying mount into account as this would mean altering the cached values for the lower filesystem. So we block using ACLs for now until we decided on a nice way to fix this. Note this limitation both in the documentation and in the code. The most straightforward solution would be to have ovl_get_acl() simply duplicate the ACLs, update the values according to the idmapped mount and return it to acl_permission_check() so it can be used in posix_acl_permission() forgetting them afterwards. This is a bit heavy handed but fairly straightforward otherwise. Link: https://github.com/brauner/mount-idmapped/issues/9 Link: https://lore.kernel.org/r/20220708090134.385160-2-brauner@kernel.org Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: linux-unionfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-07-01vfs: fix copy_file_range() regression in cross-fs copiesAmir Goldstein1-0/+4
A regression has been reported by Nicolas Boichat, found while using the copy_file_range syscall to copy a tracefs file. Before commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") the kernel would return -EXDEV to userspace when trying to copy a file across different filesystems. After this commit, the syscall doesn't fail anymore and instead returns zero (zero bytes copied), as this file's content is generated on-the-fly and thus reports a size of zero. Another regression has been reported by He Zhe - the assertion of WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when copying from a sysfs file whose read operation may return -EOPNOTSUPP. Since we do not have test coverage for copy_file_range() between any two types of filesystems, the best way to avoid these sort of issues in the future is for the kernel to be more picky about filesystems that are allowed to do copy_file_range(). This patch restores some cross-filesystem copy restrictions that existed prior to commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices"), namely, cross-sb copy is not allowed for filesystems that do not implement ->copy_file_range(). Filesystems that do implement ->copy_file_range() have full control of the result - if this method returns an error, the error is returned to the user. Before this change this was only true for fs that did not implement the ->remap_file_range() operation (i.e. nfsv3). Filesystems that do not implement ->copy_file_range() still fall-back to the generic_copy_file_range() implementation when the copy is within the same sb. This helps the kernel can maintain a more consistent story about which filesystems support copy_file_range(). nfsd and ksmbd servers are modified to fall-back to the generic_copy_file_range() implementation in case vfs_copy_file_range() fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of server-side-copy. fall-back to generic_copy_file_range() is not implemented for the smb operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct change of behavior. Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/ Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/ Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/ Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/ Reported-by: Nicolas Boichat <drinkcat@chromium.org> Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Luis Henriques <lhenriques@suse.de> Fixes: 64bf5ff58dff ("vfs: no fallback for ->copy_file_range") Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/ Reported-by: He Zhe <zhe.he@windriver.com> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-26ksmbd: use vfs_llseek instead of dereferencing NULLJason A. Donenfeld1-2/+2
By not checking whether llseek is NULL, this might jump to NULL. Also, it doesn't check FMODE_LSEEK. Fix this by using vfs_llseek(), which always does the right thing. Fixes: f44158485826 ("cifsd: add file operations") Cc: stable@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: Ronnie Sahlberg <lsahlber@redhat.com> Cc: Hyunchul Lee <hyc.lee@gmail.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-24ksmbd: set the range of bytes to zero without extending file size in ↵Namjae Jeon1-1/+3
FSCTL_ZERO_DATA generic/091, 263 test failed since commit f66f8b94e7f2 ("cifs: when extending a file with falloc we should make files not-sparse"). FSCTL_ZERO_DATA sets the range of bytes to zero without extending file size. The VFS_FALLOCATE_FL_KEEP_SIZE flag should be used even on non-sparse files. Cc: stable@vger.kernel.org Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-04-15ksmbd: remove filename in ksmbd_fileNamjae Jeon1-4/+2
If the filename is change by underlying rename the server, fp->filename and real filename can be different. This patch remove the uses of fp->filename in ksmbd and replace it with d_path(). Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-02block: remove genhd.hChristoph Hellwig1-1/+0
There is no good reason to keep genhd.h separate from the main blkdev.h header that includes it. So fold the contents of genhd.h into blkdev.h and remove genhd.h entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-12ksmbd: Use the SMB3_Create definitions from the sharedRonnie Sahlberg1-4/+4
Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-14ksmbd: add validation in smb2_ioctlNamjae Jeon1-1/+1
Add validation for request/response buffer size check in smb2_ioctl and fsctl_copychunk() take copychunk_ioctl_req pointer and the other arguments instead of smb2_ioctl_req structure and remove an unused smb2_ioctl_req argument of fsctl_validate_negotiate_info. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-25ksmbd: use LOOKUP_BENEATH to prevent the out of share accessHyunchul Lee1-74/+82
instead of removing '..' in a given path, call kern_path with LOOKUP_BENEATH flag to prevent the out of share access. ran various test on this: smb2-cat-async smb://127.0.0.1/homes/../out_of_share smb2-cat-async smb://127.0.0.1/homes/foo/../../out_of_share smbclient //127.0.0.1/homes -c "mkdir ../foo2" smbclient //127.0.0.1/homes -c "rename bar ../bar" Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Boehme <slow@samba.org> Tested-by: Steve French <smfrench@gmail.com> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23ksmbd: remove follow symlinks supportNamjae Jeon1-23/+9
Use LOOKUP_NO_SYMLINKS flags for default lookup to prohibit the middle of symlink component lookup and remove follow symlinks parameter support. We re-implement it as reparse point later. Test result: smbclient -Ulinkinjeon%1234 //172.30.1.42/share -c "get hacked/passwd passwd" NT_STATUS_OBJECT_NAME_NOT_FOUND opening remote file \hacked\passwd Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-04ksmbd: fix translation in acl entriesChristian Brauner1-2/+2
The ksmbd server performs translation of posix acls to smb acls. Currently the translation is wrong since the idmapping of the mount is used to map the ids into raw userspace ids but what is relevant is the user namespace of ksmbd itself. The user namespace of ksmbd itself which is the initial user namespace. The operation is similar to asking "What *ids would a userspace process see given that k*id in the relevant user namespace?". Before the final translation we need to apply the idmapping of the mount in case any is used. Add two simple helpers for ksmbd. Cc: Steve French <stfrench@microsoft.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Hyunchul Lee <hyc.lee@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-04ksmbd: fix lookup on idmapped mountsChristian Brauner1-19/+24
It's great that the new in-kernel ksmbd server will support idmapped mounts out of the box! However, lookup is currently broken. Lookup helpers such as lookup_one_len() call inode_permission() internally to ensure that the caller is privileged over the inode of the base dentry they are trying to lookup under. So the permission checking here is currently wrong. Linux v5.15 will gain a new lookup helper lookup_one() that does take idmappings into account. I've added it as part of my patch series to make btrfs support idmapped mounts. The new helper is in linux-next as part of David's (Sterba) btrfs for-next branch as commit c972214c133b ("namei: add mapping aware lookup helper"). I've said it before during one of my first reviews: I would very much recommend adding fstests to [1]. It already seems to have very rudimentary cifs support. There is a completely generic idmapped mount testsuite that supports idmapped mounts. [1]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/ Cc: Colin Ian King <colin.king@canonical.com> Cc: Steve French <stfrench@microsoft.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Hyunchul Lee <hyc.lee@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: David Sterba <dsterba@suse.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-13ksmbd: remove select FS_POSIX_ACL in KconfigNamjae Jeon1-0/+9
ksmbd is forcing to turn on FS_POSIX_ACL in Kconfig to use vfs acl functions(posix_acl_alloc, get_acl, set_posix_acl). OpenWRT and other platform doesn't use acl and this config is disable by default in kernel. This patch use IS_ENABLED() to know acl config is enable and use acl function if it is enable. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-13ksmbd: fix memory leak in ksmbd_vfs_get_sd_xattr()Namjae Jeon1-49/+52
Add free acl.sd_buf and n.data on error handling in ksmbd_vfs_get_sd_xattr(). Reported-by: Coverity Scan <scan-admin@coverity.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-10ksmbd: uninterruptible wait for a file being unlockedHyunchul Lee1-2/+2
the wait can be canceled by SMB2_CANCEL, SMB2_CLOSE, SMB2_LOGOFF, disconnection or shutdown, we don't have to use wait_event_interruptible. And this remove the warning from Coverity: CID 1502834 (#1 of 1): Unused value (UNUSED_VALUE) returned_value: Assigning value from ksmbd_vfs_posix_lock_wait(flock) to err here, but that stored value is overwritten before it can be used. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-10ksmbd: use kasprintf() in ksmbd_vfs_xattr_stream_name()Dan Carpenter1-21/+6
Simplify the code by using kasprintf(). This also silences a Smatch warning: fs/ksmbd/vfs.c:1725 ksmbd_vfs_xattr_stream_name() warn: inconsistent indenting Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-05ksmbd: call mnt_user_ns once in a functionHyunchul Lee1-2/+3
Avoid calling mnt_user_ns() many time in a function. Cc: Christoph Hellwig <hch@infradead.org> Cc: Christian Brauner <christian@brauner.io> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-02ksmbd: add user namespace supportHyunchul Lee1-65/+107
For user namespace support, call vfs functions with struct user_namespace got from struct path. This patch have been tested mannually as below. Create an id-mapped mount using the mount-idmapped utility (https://github.com/brauner/mount-idmapped). $ mount-idmapped --map-mount b:1003:1002:1 /home/foo <EXPORT DIR>/foo (the user, "foo" is 1003, and the user "bar" is 1002). And mount the export directory using cifs with the user, "bar". succeed to create/delete/stat/read/write files and directory in the <EXPORT DIR>/foo. But fail with a bind mount for /home/foo. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-29ksmbd: use ksmbd_vfs_lock_parent to get stable parent dentryNamjae Jeon1-1/+1
Use ksmbd_vfs_lock_parent to get stable parent dentry and remove PARENT_INODE macro. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-29ksmbd: opencode to remove FP_INODE macroNamjae Jeon1-1/+1
Opencode to remove FP_INODE macro. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-29ksmbd: fix dentry racy with rename()Namjae Jeon1-6/+8
Using ->d_name can be broken due to races with rename(). So use %pd with ->d_name to print filename and In other cases, use it under ->d_lock. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-28ksmbd: set MAY_* flags together with open flagsHyunchul Lee1-29/+13
set MAY_* flags together with open flags and remove ksmbd_vfs_inode_permission(). Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-28ksmbd: factor out a ksmbd_vfs_lock_parent helperHyunchul Lee1-63/+62
Factor out a self-contained helper to get stable parent dentry. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-28ksmbd: move fs/cifsd to fs/ksmbdNamjae Jeon1-0/+1870
Move fs/cifsd to fs/ksmbd and rename the remaining cifsd name to ksmbd. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>