summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2015-08-31nfsd: SUPPATTR_EXCLCREAT must be encoded before SECURITY_LABEL.Kinglong Mee1-6/+7
The encode order should be as the bitmask defined order. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-31nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bugKinglong Mee1-14/+31
Currently we'll respond correctly to a request for either FS_LAYOUT_TYPES or LAYOUT_TYPES, but not to a request for both attributes simultaneously. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-31NFSD: Store parent's stat in a separate valueKinglong Mee1-3/+10
After commit ae7095a7c4 (nfsd4: helper function for getting mounted_on ino) we ignore the return value from get_parent_attributes(). Also, the following FATTR4_WORD2_LAYOUT_BLKSIZE uses stat.blksize, so to avoid overwriting that, use an independent value for the parent's attributes. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13nfsd: Fix two typos in commentsAndreas Gruenbacher1-2/+2
(espect -> expect) and (no -> know) Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13lockd: NLM grace period shouldn't block NFSv4 opensJ. Bruce Fields4-9/+31
NLM locks don't conflict with NFSv4 share reservations, so we're not going to learn anything new by watiting for them. They do conflict with NFSv4 locks and with delegations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13nfsd: include linux/nfs4.h in export.hJeff Layton1-0/+1
export.h refers to the pnfs_layouttype enum, which is defined there. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13sunrpc/nfsd: Remove redundant code by exports seq_operations functionsKinglong Mee1-70/+3
Nfsd has implement a site of seq_operations functions as sunrpc's cache. Just exports sunrpc's codes, and remove nfsd's redundant codes. v8, same as v6 Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errorsKinglong Mee2-36/+88
According to Christoph's advice, this patch introduce a new helper nfsd4_cb_sequence_done() for processing more callback errors, following the example of the client's nfs41_sequence_done(). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10Merge branch 'for-4.2' into for-4.3J. Bruce Fields1-6/+6
2015-08-10nfsd: Remove unused clientid arguments from, find_lockowner_str{_locked}Kinglong Mee1-10/+6
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Use lk_new_xxx instead of v.new.xxx for nfs4_lockownerKinglong Mee1-4/+4
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Remove macro LOFF_OVERFLOWKinglong Mee1-5/+2
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Remove duplicate checking of nfsd_net in nfs4_laundromat()Kinglong Mee1-2/+0
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Remove unused values in nfs4_setlease()Kinglong Mee1-2/+1
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Remove nfs4_set_claim_prev()Kinglong Mee1-7/+1
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Drop duplicate checking of seqid in nfsd4_create_session()Kinglong Mee1-5/+3
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Remove unneeded values in nfsd4_open()Kinglong Mee1-3/+1
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Add missing gen_confirm in nfsd4_setclientid()Kinglong Mee1-2/+3
Commit 294ac32e99 "nfsd: protect clid and verifier generation with client_lock" moved gen_confirm() to gen_clid(). After that commit, setclientid will return a bad reply with all-zero verifier after copy_clid(). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: New counter for generating client confirm verifierKinglong Mee2-1/+2
If using clientid_counter, it seems possible that gen_confirm could generate the same verifier for the same client in some situations. Add a new counter for client confirm verifier to make sure gen_confirm generates a different verifier on each call for the same clientid. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Fix memory leak of so_owner.data in nfs4_stateownerKinglong Mee1-4/+11
v2, new helper nfs4_free_stateowner for freeing so_owner.data and sop Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Add layouts checking in client_has_state()Kinglong Mee1-0/+3
Layout is a state resource, nfsd should check it too. v2, drop unneeded updating in nfsd4_renew() v3, fix compile error without CONFIG_NFSD_PNFS Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd: Fix a memory leak of struct file_lockKinglong Mee1-0/+1
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd/sunrpc: abstract out svc_set_num_threads to sv_opsJeff Layton1-3/+5
Add an operation that will do setup of the service. In the case of a classic thread-based service that means starting up threads. In the case of a workqueue-based service, the setup will do something different. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirliey.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operationJeff Layton3-6/+9
For now, all services use svc_xprt_do_enqueue, but once we add workqueue-based service support, we'll need to do something different. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd/sunrpc: move sv_module parm into sv_opsJeff Layton1-1/+2
...not technically an operation, but it's more convenient and cleaner to pass the module pointer in this struct. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd/sunrpc: move sv_function into sv_opsJeff Layton1-1/+2
Since we now have a container for holding svc_serv operations, move the sv_function into it as well. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10nfsd/sunrpc: add a new svc_serv_ops struct and move sv_shutdown into itJeff Layton3-3/+14
In later patches we'll need to abstract out more operations on a per-service level, besides sv_shutdown and sv_function. Declare a new svc_serv_ops struct to hold these operations, and move sv_shutdown into this struct. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-31nfsd: do nfs4_check_fh in nfs4_check_file instead of nfs4_check_olstateidJeff Layton1-6/+6
Currently, preprocess_stateid_op calls nfs4_check_olstateid which verifies that the open stateid corresponds to the current filehandle in the call by calling nfs4_check_fh. If the stateid is a NFS4_DELEG_STID however, then no such check is done. This could cause incorrect enforcement of permissions, because the nfsd_permission() call in nfs4_check_file uses current the current filehandle, but any subsequent IO operation will use the file descriptor in the stateid. Move the call to nfs4_check_fh into nfs4_check_file instead so that it can be done for all stateid types. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org [bfields: moved fh check to avoid NULL deref in special stateid case] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Add macro NFS_ACL_MASK for ACLKinglong Mee2-8/+6
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Remove duplicate define of IDMAP_NAMESZ/IDMAP_TYPE_xxKinglong Mee2-6/+1
Just using the macro defined in nfs_idmap.h. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Drop including client's header file nfs_fs.hKinglong Mee1-1/+3
nfs_fs.h is a client's header file. # ll fs/nfsd/nfs4acl.o fs/nfsd/nfsd.ko -rw-r--r--. 1 root root 328248 Jul 3 19:26 fs/nfsd/nfs4acl.o -rw-r--r--. 1 root root 7452016 Jul 3 19:26 fs/nfsd/nfsd.ko After this patch, # ll fs/nfsd/nfs4acl.o fs/nfsd/nfsd.ko -rw-r--r--. 1 root root 150872 Jul 3 19:15 fs/nfsd/nfs4acl.o -rw-r--r--. 1 root root 7273792 Jul 3 19:23 fs/nfsd/nfsd.ko Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Set lc_size_chg before ops->proc_layoutcommitKinglong Mee1-5/+1
After proc_layoutcommit success, i_size_read(inode) always >= new_size. Just set lc_size_chg before proc_layoutcommit, if proc_layoutcommit failed, nfsd will skip the lc_size_chg, so it's no harm. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Fix a memory leak in nfsd4_list_rec_dir()Kinglong Mee1-3/+9
If lookup_one_len() failed, nfsd should free those memory allocated for fname. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Fix a file leak on nfsd4_layout_setlease failureKinglong Mee1-0/+1
If nfsd4_layout_setlease fails, nfsd will not put ls->ls_file. Fix commit c5c707f96f "nfsd: implement pNFS layout recalls". Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystemKinglong Mee1-5/+6
On an absent filesystem (one served by another server), we need to be able to handle requests for certain attributest (like fs_locations, so the client can find out which server does have the filesystem), but others we can't. We forgot to take that into account when adding another attribute bitmask work for the SECURITY_LABEL attribute. There an export entry with the "refer" option can result in: [ 88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249! [ 88.414828] invalid opcode: 0000 [#1] SMP [ 88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd] [ 88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1 [ 88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000 [ 88.419729] RIP: 0010:[<ffffffffa04b3c10>] [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.420376] RSP: 0000:ffff8800785db998 EFLAGS: 00010206 [ 88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980 [ 88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000 [ 88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000 [ 88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a [ 88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980 [ 88.424295] FS: 0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000 [ 88.424944] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0 [ 88.426285] Stack: [ 88.426921] ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0 [ 88.427585] ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0 [ 88.428228] ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980 [ 88.428877] Call Trace: [ 88.429527] [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd] [ 88.430168] [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0 [ 88.430807] [<ffffffff8123833e>] ? d_lookup+0x2e/0x60 [ 88.431449] [<ffffffff81236133>] ? dput+0x33/0x230 [ 88.432097] [<ffffffff8123f214>] ? mntput+0x24/0x40 [ 88.432719] [<ffffffff812272b2>] ? path_put+0x22/0x30 [ 88.433340] [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd] [ 88.433954] [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd] [ 88.434601] [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd] [ 88.435172] [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd] [ 88.435710] [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd] [ 88.436447] [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd] [ 88.437011] [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd] [ 88.437566] [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd] [ 88.438157] [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd] [ 88.438680] [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc] [ 88.439192] [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc] [ 88.439694] [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd] [ 88.440194] [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 88.440697] [<ffffffff810bb728>] kthread+0xd8/0xf0 [ 88.441260] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.441762] [<ffffffff81789e58>] ret_from_fork+0x58/0x90 [ 88.442322] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe [ 88.444052] RIP [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.444658] RSP <ffff8800785db998> [ 88.445232] ---[ end trace 6cb9d0487d94a29f ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two families of fixes: - Fix an FPU context related boot crash on newer x86 hardware with larger context sizes than what most people test. To fix this without ugly kludges or extensive reverts we had to touch core task allocator, to allow x86 to determine the task size dynamically, at boot time. I've tested it on a number of x86 platforms, and I cross-built it to a handful of architectures: (warns) (warns) testing x86-64: -git: pass ( 0), -tip: pass ( 0) testing x86-32: -git: pass ( 0), -tip: pass ( 0) testing arm: -git: pass ( 1359), -tip: pass ( 1359) testing cris: -git: pass ( 1031), -tip: pass ( 1031) testing m32r: -git: pass ( 1135), -tip: pass ( 1135) testing m68k: -git: pass ( 1471), -tip: pass ( 1471) testing mips: -git: pass ( 1162), -tip: pass ( 1162) testing mn10300: -git: pass ( 1058), -tip: pass ( 1058) testing parisc: -git: pass ( 1846), -tip: pass ( 1846) testing sparc: -git: pass ( 1185), -tip: pass ( 1185) ... so I hope the cross-arch impact 'none', as intended. (by Dave Hansen) - Fix various NMI handling related bugs unearthed by the big asm code rewrite and generally make the NMI code more robust and more maintainable while at it. These changes are a bit late in the cycle, I hope they are still acceptable. (by Andy Lutomirski)" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86 x86/fpu, sched: Dynamically allocate 'struct fpu' x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing code x86/nmi/64: Make the "NMI executing" variable more consistent x86/nmi/64: Minor asm simplification x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection x86/nmi/64: Reorder nested NMI checks x86/nmi/64: Improve nested NMI comments x86/nmi/64: Switch stacks on userspace NMI entry x86/nmi/64: Remove asm code that saves CR2 x86/nmi: Enable nested do_nmi() handling for 64-bit kernels
2015-07-18Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-22/+27
Merge fixes from Andrew Morton: "25 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits) lib/decompress: set the compressor name to NULL on error mm/cma_debug: correct size input to bitmap function mm/cma_debug: fix debugging alloc/free interface mm/page_owner: set correct gfp_mask on page_owner mm/page_owner: fix possible access violation fsnotify: fix oops in fsnotify_clear_marks_by_group_flags() /proc/$PID/cmdline: fixup empty ARGV case dma-debug: skip debug_dma_assert_idle() when disabled hexdump: fix for non-aligned buffers checkpatch: fix long line messages about patch context mm: clean up per architecture MM hook header files MAINTAINERS: uclinux-h8-devel is moderated for non-subscribers mailmap: update Sudeep Holla's email id Update Viresh Kumar's email address mm, meminit: suppress unused memory variable warning configfs: fix kernel infoleak through user-controlled format string include, lib: add __printf attributes to several function prototypes s390/hugetlb: add hugepages_supported define mm: hugetlb: allow hugepages_supported to be architecture specific revert "s390/mm: make hugepages_supported a boot time decision" ...
2015-07-18Merge branch 'for-linus-4.2' of ↵Linus Torvalds4-6/+34
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are all from Filipe, and cover a few problems we've had reported on the list recently (along with ones he found on his own)" * 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix file corruption after cloning inline extents Btrfs: fix order by which delayed references are run Btrfs: fix list transaction->pending_ordered corruption Btrfs: fix memory leak in the extent_same ioctl Btrfs: fix shrinking truncate when the no_holes feature is enabled
2015-07-18x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it ↵Ingo Molnar1-2/+2
on x86 Don't burden architectures without dynamic task_struct sizing with the overhead of dynamic sizing. Also optimize the x86 code a bit by caching task_struct_size. Acked-and-Tested-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1437128892-9831-3-git-send-email-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-18x86/fpu, sched: Dynamically allocate 'struct fpu'Dave Hansen1-2/+2
The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1437128892-9831-2-git-send-email-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-18fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()Jan Kara1-20/+14
fsnotify_clear_marks_by_group_flags() can race with fsnotify_destroy_marks() so when fsnotify_destroy_mark_locked() drops mark_mutex, a mark from the list iterated by fsnotify_clear_marks_by_group_flags() can be freed and we dereference free memory in the loop there. Fix the problem by keeping mark_mutex held in fsnotify_destroy_mark_locked(). The reason why we drop that mutex is that we need to call a ->freeing_mark() callback which may acquire mark_mutex again. To avoid this and similar lock inversion issues, we move the call to ->freeing_mark() callback to the kthread destroying the mark. Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Ashish Sangwan <a.sangwan@samsung.com> Suggested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-18/proc/$PID/cmdline: fixup empty ARGV caseAlexey Dobriyan1-0/+5
/proc/*/cmdline code checks if it should look at ENVP area by checking last byte of ARGV area: rv = access_remote_vm(mm, arg_end - 1, &c, 1, 0); if (rv <= 0) goto out_free_page; If ARGV is somehow made empty (by doing execve(..., NULL, ...) or manually setting ->arg_start and ->arg_end to equal values), the decision will be based on byte which doesn't even belong to ARGV/ENVP. So, quickly check if ARGV area is empty and report 0 to match previous behaviour. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-18configfs: fix kernel infoleak through user-controlled format stringNicolas Iooss1-2/+2
Some modules call config_item_init_type_name() and config_group_init_type_name() with parameter "name" directly controlled by userspace. These two functions call config_item_set_name() with this name used as a format string, which can be used to leak information such as content of the stack to userspace. For example, make_netconsole_target() in netconsole module calls config_item_init_type_name() with the name of a newly-created directory. This means that the following commands give some unexpected output, with configfs mounted in /sys/kernel/config/ and on a system with a configured eth0 ethernet interface: # modprobe netconsole # mkdir /sys/kernel/config/netconsole/target_%lx # echo eth0 > /sys/kernel/config/netconsole/target_%lx/dev_name # echo 1 > /sys/kernel/config/netconsole/target_%lx/enabled # echo eth0 > /sys/kernel/config/netconsole/target_%lx/dev_name # dmesg |tail -n1 [ 142.697668] netconsole: target (target_ffffffffc0ae8080) is enabled, disable to update parameters The directory name is correct but %lx has been interpreted in the internal item name, displayed here in the error message used by store_dev_name() in drivers/net/netconsole.c. To fix this, update every caller of config_item_set_name to use "%s" when operating on untrusted input. This issue was found using -Wformat-security gcc flag, once a __printf attribute has been added to config_item_set_name(). Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-18fs, proc: add help for CONFIG_PROC_CHILDRENIago López Galeiras1-0/+6
The purpose of the option was documented in Documentation/filesystems/proc.txt but the help text was missing. Add small help text that also points to the documentation. Signed-off-by: Iago López Galeiras <iago@endocode.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-17Merge tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggyLinus Torvalds3-17/+16
Pull jfs fixes from David Kleikamp: "A couple trivial fixes and an error path fix" * tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggy: jfs: clean up jfs_rename and fix out of order unlock jfs: fix indentation on if statement jfs: removed a prohibited space after opening parenthesis
2015-07-15Merge tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linuxLinus Torvalds2-30/+26
Pull file locking updates from Jeff Layton: "I had thought that I was going to get away without a pull request this cycle. There was a NFSv4 file locking problem that cropped up that I tried to fix in the NFSv4 code alone, but that fix has turned out to be problematic. These patches fix this in the correct way. Note that this touches some NFSv4 code as well. Ordinarily I'd wait for Trond to ACK this, but he's on holiday right now and the bug is rather nasty. So I suggest we merge this and if he raises issues with it we can sort it out when he gets back" Acked-by: Bruce Fields <bfields@fieldses.org> Acked-by: Dan Williams <dan.j.williams@intel.com> [ +1 to this series fixing a 100% reproducible slab corruption + general protection fault in my nfs-root test environment. - Dan ] Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> * tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linux: locks: inline posix_lock_file_wait and flock_lock_file_wait nfs4: have do_vfs_lock take an inode pointer locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait locks: have flock_lock_file take an inode pointer instead of a filp Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"
2015-07-15jfs: clean up jfs_rename and fix out of order unlockDave Kleikamp1-14/+13
The end of jfs_rename(), which is also used by the error paths, included a call to IWRITE_UNLOCK(new_ip) after labels out1, out2 and out3. If we come in through these labels, IWRITE_LOCK() has not been called yet. In moving that call to the correct spot, I also moved some exceptional truncate code earlier as well, since the early error paths don't need to deal with it, and I renamed out4: to out_tx: so a future patch by Jan Kara doesn't need to deal with renumbering or confusing out-of-order labels. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2015-07-14Btrfs: fix file corruption after cloning inline extentsFilipe Manana1-0/+14
Using the clone ioctl (or extent_same ioctl, which calls the same extent cloning function as well) we end up allowing copy an inline extent from the source file into a non-zero offset of the destination file. This is something not expected and that the btrfs code is not prepared to deal with - all inline extents must be at a file offset equals to 0. For example, the following excerpt of a test case for fstests triggers a crash/BUG_ON() on a write operation after an inline extent is cloned into a non-zero offset: _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount # Create our test files. File foo has the same 2K of data at offset 4K # as file bar has at its offset 0. $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 4K" \ -c "pwrite -S 0xbb 4k 2K" \ -c "pwrite -S 0xcc 8K 4K" \ $SCRATCH_MNT/foo | _filter_xfs_io # File bar consists of a single inline extent (2K size). $XFS_IO_PROG -f -s -c "pwrite -S 0xbb 0 2K" \ $SCRATCH_MNT/bar | _filter_xfs_io # Now call the clone ioctl to clone the extent of file bar into file # foo at its offset 4K. This made file foo have an inline extent at # offset 4K, something which the btrfs code can not deal with in future # IO operations because all inline extents are supposed to start at an # offset of 0, resulting in all sorts of chaos. # So here we validate that clone ioctl returns an EOPNOTSUPP, which is # what it returns for other cases dealing with inlined extents. $CLONER_PROG -s 0 -d $((4 * 1024)) -l $((2 * 1024)) \ $SCRATCH_MNT/bar $SCRATCH_MNT/foo # Because of the inline extent at offset 4K, the following write made # the kernel crash with a BUG_ON(). $XFS_IO_PROG -c "pwrite -S 0xdd 6K 2K" $SCRATCH_MNT/foo | _filter_xfs_io status=0 exit The stack trace of the BUG_ON() triggered by the last write is: [152154.035903] ------------[ cut here ]------------ [152154.036424] kernel BUG at mm/page-writeback.c:2286! [152154.036424] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [152154.036424] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpu$ [152154.036424] CPU: 2 PID: 17873 Comm: xfs_io Tainted: G W 4.1.0-rc6-btrfs-next-11+ #2 [152154.036424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [152154.036424] task: ffff880429f70990 ti: ffff880429efc000 task.ti: ffff880429efc000 [152154.036424] RIP: 0010:[<ffffffff8111a9d5>] [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP: 0018:ffff880429effc68 EFLAGS: 00010246 [152154.036424] RAX: 0200000000000806 RBX: ffffea0006a6d8f0 RCX: 0000000000000001 [152154.036424] RDX: 0000000000000000 RSI: ffffffff81155d1b RDI: ffffea0006a6d8f0 [152154.036424] RBP: ffff880429effc78 R08: ffff8801ce389fe0 R09: 0000000000000001 [152154.036424] R10: 0000000000002000 R11: ffffffffffffffff R12: ffff8800200dce68 [152154.036424] R13: 0000000000000000 R14: ffff8800200dcc88 R15: ffff8803d5736d80 [152154.036424] FS: 00007fbf119f6700(0000) GS:ffff88043d280000(0000) knlGS:0000000000000000 [152154.036424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [152154.036424] CR2: 0000000001bdc000 CR3: 00000003aa555000 CR4: 00000000000006e0 [152154.036424] Stack: [152154.036424] ffff8803d5736d80 0000000000000001 ffff880429effcd8 ffffffffa04e97c1 [152154.036424] ffff880429effd68 ffff880429effd60 0000000000000001 ffff8800200dc9c8 [152154.036424] 0000000000000001 ffff8800200dcc88 0000000000000000 0000000000001000 [152154.036424] Call Trace: [152154.036424] [<ffffffffa04e97c1>] lock_and_cleanup_extent_if_need+0x147/0x18d [btrfs] [152154.036424] [<ffffffffa04ea82c>] __btrfs_buffered_write+0x245/0x4c8 [btrfs] [152154.036424] [<ffffffffa04ed14b>] ? btrfs_file_write_iter+0x150/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed15a>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed2c7>] btrfs_file_write_iter+0x2cc/0x3e0 [btrfs] [152154.036424] [<ffffffff81165a4a>] __vfs_write+0x7c/0xa5 [152154.036424] [<ffffffff81165f89>] vfs_write+0xa0/0xe4 [152154.036424] [<ffffffff81166855>] SyS_pwrite64+0x64/0x82 [152154.036424] [<ffffffff81465197>] system_call_fastpath+0x12/0x6f [152154.036424] Code: 48 89 c7 e8 0f ff ff ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 ae ef 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 59 49 8b 3c 2$ [152154.036424] RIP [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP <ffff880429effc68> [152154.242621] ---[ end trace e3d3376b23a57041 ]--- Fix this by returning the error EOPNOTSUPP if an attempt to copy an inline extent into a non-zero offset happens, just like what is done for other scenarios that would require copying/splitting inline extents, which were introduced by the following commits: 00fdf13a2e9f ("Btrfs: fix a crash of clone with inline extents's split") 3f9e3df8da3c ("btrfs: replace error code from btrfs_drop_extents") Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-07-13locks: inline posix_lock_file_wait and flock_lock_file_waitJeff Layton1-28/+0
They just call file_inode and then the corresponding *_inode_file_wait function. Just make them static inlines instead. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
2015-07-13nfs4: have do_vfs_lock take an inode pointerJeff Layton1-8/+8
Now that we have file locking helpers that can deal with an inode instead of a filp, we can change the NFSv4 locking code to use that instead. This should fix the case where we have a filp that is closed while flock or OFD locks are set on it, and the task is signaled so that it doesn't wait for the LOCKU reply to come in before the filp is freed. At that point we can end up with a use-after-free with the current code, which relies on dereferencing the fl_file in the lock request. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org> Tested-by: "J. Bruce Fields" <bfields@fieldses.org>