summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2011-06-21Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linuxLinus Torvalds3-20/+19
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: nfsd4: fix break_lease flags on nfsd open nfsd: link returns nfserr_delay when breaking lease nfsd: v4 support requires CRYPTO nfsd: fix dependency of nfsd on auth_rpcgss
2011-06-21Merge branch 'for-linus' of ↵Linus Torvalds10-36/+5
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper fix comment in generic_permission() kill obsolete comment for follow_down() proc_sys_permission() is OK in RCU mode reiserfs_permission() doesn't need to bail out in RCU mode proc_fd_permission() is doesn't need to bail out in RCU mode nilfs2_permission() doesn't need to bail out in RCU mode logfs doesn't need ->permission() at all coda_ioctl_permission() is safe in RCU mode cifs_permission() doesn't need to bail out in RCU mode bad_inode_permission() is safe from RCU mode ubifs: dereferencing an ERR_PTR in ubifs_mount()
2011-06-20Merge branch 'for-linus' of ↵Linus Torvalds11-189/+174
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: avoid delayed metadata items during commits btrfs: fix uninitialized return value btrfs: fix wrong reservation when doing delayed inode operations btrfs: Remove unused sysfs code btrfs: fix dereference of ERR_PTR value Btrfs: fix relocation races Btrfs: set no_trans_join after trying to expand the transaction Btrfs: protect the pending_snapshots list with trans_lock Btrfs: fix path leakage on subvol deletion Btrfs: drop the delalloc_bytes check in shrink_delalloc Btrfs: check the return value from set_anon_super
2011-06-20fix comment in generic_permission()Al Viro1-1/+2
CAP_DAC_OVERRIDE is enough for MAY_EXEC on directory, even if no exec bits are set. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20kill obsolete comment for follow_down()Al Viro1-3/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20proc_sys_permission() is OK in RCU modeAl Viro1-3/+0
nothing blocking there, since all instances of sysctl ->permissions() method are non-blocking - both of them, that is. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20reiserfs_permission() doesn't need to bail out in RCU modeAl Viro1-2/+0
nothing blocking other than generic_permission() (and check_acl callback does bail out in RCU mode). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20proc_fd_permission() is doesn't need to bail out in RCU modeAl Viro1-5/+1
nothing blocking except generic_permission() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20nilfs2_permission() doesn't need to bail out in RCU modeAl Viro1-6/+1
Nothing blocking except for generic_permission(). Which will DTRT. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20logfs doesn't need ->permission() at allAl Viro1-8/+0
... and never did, what with its ->permission() being what we do by default when ->permission is NULL... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20coda_ioctl_permission() is safe in RCU modeAl Viro1-2/+0
return (mask & MAY_EXEC) ? -EACCES : 0; is non-blocking... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20cifs_permission() doesn't need to bail out in RCU modeAl Viro1-3/+0
nothing potentially blocking except generic_permission(), which will DTRT Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20bad_inode_permission() is safe from RCU modeAl Viro1-3/+0
return -EIO; is *not* a blocking operation, thank you very much. Nick, what the hell have you been smoking? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20ubifs: dereferencing an ERR_PTR in ubifs_mount()Dan Carpenter1-0/+1
d251ed271d5 "ubifs: fix sget races" left out the goto from this error path so the static checkers complain that we're dereferencing "sb" when it's an ERR_PTR. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20nfsd4: fix break_lease flags on nfsd openJ. Bruce Fields1-6/+9
Thanks to Casey Bodley for pointing out that on a read open we pass 0, instead of O_RDONLY, to break_lease, with the result that a read open is treated like a write open for the purposes of lease breaking! Reported-by: Casey Bodley <cbodley@citi.umich.edu> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-06-19Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', ↵Linus Torvalds1-1/+4
'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tools/perf: Fix static build of perf tool tracing: Fix regression in printk_formats file * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: Make watchdog robust vs. interruption timerfd: Fix wakeup of processes when timer is cancelled on clock change * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, MAINTAINERS: Add x86 MCE people x86, efi: Do not reserve boot services regions within reserved areas
2011-06-18isofs: fix bh leak in isofs_fill_super() error caseLinus Torvalds1-1/+2
In isofs_fill_super(), when an iso_primary_descriptor is found, it is kept in pri_bh. The error cases don't properly release it. Fix it. Reported-and-tested-by: 김원석 <stanley.will.kim@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-18Btrfs: avoid delayed metadata items during commitsChris Mason3-10/+28
Snapshot creation has two phases. One is the initial snapshot setup, and the second is done during commit, while nobody is allowed to modify the root we are snapshotting. The delayed metadata insertion code can break that rule, it does a delayed inode update on the inode of the parent of the snapshot, and delayed directory item insertion. This makes sure to run the pending delayed operations before we record the snapshot root, which avoids corruptions. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17btrfs: fix uninitialized return valueDavid Sterba1-3/+3
When allocation fails in btrfs_read_fs_root_no_name, ret is not set although it is returned, holding a garbage value. Signed-off-by: David Sterba <dsterba@suse.cz> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17btrfs: fix wrong reservation when doing delayed inode operationsMiao Xie2-6/+20
We have migrated the space for the delayed inode items from trans_block_rsv to global_block_rsv, but we forgot to set trans->block_rsv to global_block_rsv when we doing delayed inode operations, and the following Oops happened: [ 9792.654889] ------------[ cut here ]------------ [ 9792.654898] WARNING: at fs/btrfs/extent-tree.c:5681 btrfs_alloc_free_block+0xca/0x27c [btrfs]() [ 9792.654899] Hardware name: To Be Filled By O.E.M. [ 9792.654900] Modules linked in: btrfs zlib_deflate libcrc32c ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables arc4 rt61pci rt2x00pci rt2x00lib snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek cfg80211 snd_hda_intel edac_core snd_seq rfkill pcspkr serio_raw snd_hda_codec eeprom_93cx6 edac_mce_amd sp5100_tco i2c_piix4 k10temp snd_hwdep snd_seq_device snd_pcm floppy r8169 xhci_hcd mii snd_timer snd soundcore snd_page_alloc ipv6 firewire_ohci pata_acpi ata_generic firewire_core pata_via crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan] [ 9792.654919] Pid: 2762, comm: rm Tainted: G W 2.6.39+ #1 [ 9792.654920] Call Trace: [ 9792.654922] [<ffffffff81053c4a>] warn_slowpath_common+0x83/0x9b [ 9792.654925] [<ffffffff81053c7c>] warn_slowpath_null+0x1a/0x1c [ 9792.654933] [<ffffffffa038e747>] btrfs_alloc_free_block+0xca/0x27c [btrfs] [ 9792.654945] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.654953] [<ffffffffa038189b>] __btrfs_cow_block+0xfc/0x30c [btrfs] [ 9792.654963] [<ffffffffa0396aa6>] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs] [ 9792.654970] [<ffffffffa0382e48>] ? read_block_for_search+0x94/0x368 [btrfs] [ 9792.654978] [<ffffffffa0381ba9>] btrfs_cow_block+0xfe/0x146 [btrfs] [ 9792.654986] [<ffffffffa03848b0>] btrfs_search_slot+0x14d/0x4b6 [btrfs] [ 9792.654997] [<ffffffffa03b8562>] ? map_extent_buffer+0x6e/0xa8 [btrfs] [ 9792.655022] [<ffffffffa03938e8>] btrfs_lookup_inode+0x2f/0x8f [btrfs] [ 9792.655025] [<ffffffff8147afac>] ? _cond_resched+0xe/0x22 [ 9792.655027] [<ffffffff8147b892>] ? mutex_lock+0x29/0x50 [ 9792.655039] [<ffffffffa03d41b1>] btrfs_update_delayed_inode+0x72/0x137 [btrfs] [ 9792.655051] [<ffffffffa03d4ea2>] btrfs_run_delayed_items+0x90/0xdb [btrfs] [ 9792.655062] [<ffffffffa039a69b>] btrfs_commit_transaction+0x228/0x654 [btrfs] [ 9792.655064] [<ffffffff8106e8da>] ? remove_wait_queue+0x3a/0x3a [ 9792.655075] [<ffffffffa03a2fa5>] btrfs_evict_inode+0x14d/0x202 [btrfs] [ 9792.655077] [<ffffffff81132bd6>] evict+0x71/0x111 [ 9792.655079] [<ffffffff81132de0>] iput+0x12a/0x132 [ 9792.655081] [<ffffffff8112aa3a>] do_unlinkat+0x106/0x155 [ 9792.655083] [<ffffffff81127b83>] ? path_put+0x1f/0x23 [ 9792.655085] [<ffffffff8109c53c>] ? audit_syscall_entry+0x145/0x171 [ 9792.655087] [<ffffffff81128410>] ? putname+0x34/0x36 [ 9792.655090] [<ffffffff8112b441>] sys_unlinkat+0x29/0x2b [ 9792.655092] [<ffffffff81482c42>] system_call_fastpath+0x16/0x1b [ 9792.655093] ---[ end trace 02b696eb02b3f768 ]--- This patch fix it by setting the reservation of the transaction handle to the correct one. Reported-by: Josef Bacik <josef@redhat.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17btrfs: Remove unused sysfs codeMaarten Lankhorst3-148/+0
Removes code no longer used. The sysfs file itself is kept, because the btrfs developers expressed interest in putting new entries to sysfs. Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17btrfs: fix dereference of ERR_PTR valueDavid Sterba1-1/+1
smatch reports: btrfs_recover_log_trees error: 'wc.replay_dest' dereferencing possible ERR_PTR() Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17Merge branch 'for-chris' of ↵Chris Mason3-3/+14
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work into for-linus Conflicts: fs/btrfs/transaction.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds4-96/+42
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: make log devices with write back caches work xfs: fix ->mknod() return value on xfs_get_acl() failure
2011-06-17Btrfs: fix relocation racesChris Mason4-13/+105
The recent commit to get rid of our trans_mutex introduced some races with block group relocation. The problem is that relocation needs to do some record keeping about each root, and it was relying on the transaction mutex to coordinate things in subtle ways. This fix adds a mutex just for the relocation code and makes sure it doesn't have a big impact on normal operations. The race is really fixed in btrfs_record_root_in_trans, which is where we step back and wait for the relocation code to finish accounting setup. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-06-17KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyringDavid Howells1-1/+1
____call_usermodehelper() now erases any credentials set by the subprocess_inf::init() function. The problem is that commit 17f60a7da150 ("capabilites: allow the application of capability limits to usermode helpers") creates and commits new credentials with prepare_kernel_cred() after the call to the init() function. This wipes all keyrings after umh_keys_init() is called. The best way to deal with this is to put the init() call just prior to the commit_creds() call, and pass the cred pointer to init(). That means that umh_keys_init() and suchlike can modify the credentials _before_ they are published and potentially in use by the rest of the system. This prevents request_key() from working as it is prevented from passing the session keyring it set up with the authorisation token to /sbin/request-key, and so the latter can't assume the authority to instantiate the key. This causes the in-kernel DNS resolver to fail with ENOKEY unconditionally. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfdLinus Torvalds1-3/+6
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd: proc: Fix Oops on stat of /proc/<zombie pid>/ns/net
2011-06-16Merge branch 'for-linus' of ↵Linus Torvalds11-168/+167
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: AFS: Use i_generation not i_version for the vnode uniquifier AFS: Set s_id in the superblock to the volume name vfs: Fix data corruption after failed write in __block_write_begin() afs: afs_fill_page reads too much, or wrong data VFS: Fix vfsmount overput on simultaneous automount fix wrong iput on d_inode introduced by e6bc45d65d Delay struct net freeing while there's a sysfs instance refering to it afs: fix sget() races, close leak on umount ubifs: fix sget races ubifs: split allocation of ubifs_info into a separate function fix leak in proc_set_super()
2011-06-16xfs: make log devices with write back caches workChristoph Hellwig3-95/+41
There's no reason not to support cache flushing on external log devices. The only thing this really requires is flushing the data device first both in fsync and log commits. A side effect is that we also have to remove the barrier write test during mount, which has been superflous since the new FLUSH+FUA code anyway. Also use the chance to flush the RT subvolume write cache before the fsync commit, which is required for correct semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-06-16AFS: Use i_generation not i_version for the vnode uniquifierDavid Howells3-10/+11
Store the AFS vnode uniquifier in the i_generation field, not the i_version field of the inode struct. i_version can then be given the AFS data version number. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16AFS: Set s_id in the superblock to the volume nameDavid Howells1-0/+1
Set s_id in the superblock to the name of the AFS volume that this superblock corresponds to. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16vfs: Fix data corruption after failed write in __block_write_begin()Jan Kara1-3/+1
I've got a report of a file corruption from fsxlinux on ext3. The important operations to the page were: mapwrite to a hole partial write to the page read - found the page zeroed from the end of the normal write The culprit seems to be that if get_block() fails in __block_write_begin() (e.g. transient ENOSPC in ext3), the function does ClearPageUptodate(page). Thus when we retry the write, the logic in __block_write_begin() thinks zeroing of the page is needed and overwrites old data. In fact, I don't see why we should ever need to zero the uptodate bit here - either the page was uptodate when we entered __block_write_begin() and it should stay so when we leave it, or it was not uptodate and noone had right to set it uptodate during __block_write_begin() so it remains !uptodate when we leave as well. So just remove clearing of the bit. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16afs: afs_fill_page reads too much, or wrong dataAnton Blanchard1-12/+9
afs_fill_page should read the page that is about to be written but the current implementation has a number of issues. If we aren't extending the file we always read PAGE_CACHE_SIZE at offset 0. If we are extending the file we try to read the entire file. Change afs_fill_page to read PAGE_CACHE_SIZE at the right offset, clamped to i_size. While here, avoid calling afs_fill_page when we are doing a PAGE_CACHE_SIZE write. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16VFS: Fix vfsmount overput on simultaneous automountAl Viro1-8/+16
[Kudos to dhowells for tracking that crap down] If two processes attempt to cause automounting on the same mountpoint at the same time, the vfsmount holding the mountpoint will be left with one too few references on it, causing a BUG when the kernel tries to clean up. The problem is that lock_mount() drops the caller's reference to the mountpoint's vfsmount in the case where it finds something already mounted on the mountpoint as it transits to the mounted filesystem and replaces path->mnt with the new mountpoint vfsmount. During a pathwalk, however, we don't take a reference on the vfsmount if it is the same as the one in the nameidata struct, but do_add_mount() doesn't know this. The fix is to make sure we have a ref on the vfsmount of the mountpoint before calling do_add_mount(). However, if lock_mount() doesn't transit, we're then left with an extra ref on the mountpoint vfsmount which needs releasing. We can handle that in follow_managed() by not making assumptions about what we can and what we cannot get from lookup_mnt() as the current code does. The callers of follow_managed() expect that reference to path->mnt will be grabbed iff path->mnt has been changed. follow_managed() and follow_automount() keep track of whether such reference has been grabbed and assume that it'll happen in those and only those cases that'll have us return with changed path->mnt. That assumption is almost correct - it breaks in case of racing automounts and in even harder to hit race between following a mountpoint and a couple of mount --move. The thing is, we don't need to make that assumption at all - after the end of loop in follow_manage() we can check if path->mnt has ended up unchanged and do mntput() if needed. The BUG can be reproduced with the following test program: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <unistd.h> #include <sys/wait.h> int main(int argc, char **argv) { int pid, ws; struct stat buf; pid = fork(); stat(argv[1], &buf); if (pid > 0) wait(&ws); return 0; } and the following procedure: (1) Mount an NFS volume that on the server has something else mounted on a subdirectory. For instance, I can mount / from my server: mount warthog:/ /mnt -t nfs4 -r On the server /data has another filesystem mounted on it, so NFS will see a change in FSID as it walks down the path, and will mark /mnt/data as being a mountpoint. This will cause the automount code to be triggered. !!! Do not look inside the mounted fs at this point !!! (2) Run the above program on a file within the submount to generate two simultaneous automount requests: /tmp/forkstat /mnt/data/testfile (3) Unmount the automounted submount: umount /mnt/data (4) Unmount the original mount: umount /mnt At this point the kernel should throw a BUG with something like the following: BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12] Note that the bug appears on the root dentry of the original mount, not the mountpoint and not the submount because sys_umount() hasn't got to its final mntput_no_expire() yet, but this isn't so obvious from the call trace: [<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82 [<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b [<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs] [<ffffffff811617f3>] kill_anon_super+0x1d/0x7e [<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs] [<ffffffff81161c17>] deactivate_locked_super+0x34/0x83 [<ffffffff811629ff>] deactivate_super+0x6f/0x7b [<ffffffff81186261>] mntput_no_expire+0x18d/0x199 [<ffffffff811862a8>] mntput+0x3b/0x44 [<ffffffff81186d87>] release_mounts+0xa2/0xbf [<ffffffff811876af>] sys_umount+0x47a/0x4ba [<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f [<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b as do_umount() is inlined. However, you can see release_mounts() in there. Note also that it may be necessary to have multiple CPU cores to be able to trigger this bug. Tested-by: Jeff Layton <jlayton@redhat.com> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16fix wrong iput on d_inode introduced by e6bc45d65dTörök Edwin1-1/+3
Git bisection shows that commit e6bc45d65df8599fdbae73be9cec4ceed274db53 causes BUG_ONs under high I/O load: kernel BUG at fs/inode.c:1368! [ 2862.501007] Call Trace: [ 2862.501007] [<ffffffff811691d8>] d_kill+0xf8/0x140 [ 2862.501007] [<ffffffff81169c19>] dput+0xc9/0x190 [ 2862.501007] [<ffffffff8115577f>] fput+0x15f/0x210 [ 2862.501007] [<ffffffff81152171>] filp_close+0x61/0x90 [ 2862.501007] [<ffffffff81152251>] sys_close+0xb1/0x110 [ 2862.501007] [<ffffffff814c14fb>] system_call_fastpath+0x16/0x1b A reliable way to reproduce this bug is: Login to KDE, run 'rsnapshot sync', and apt-get install openjdk-6-jdk, and apt-get remove openjdk-6-jdk. The buggy part of the patch is this: struct inode *inode = NULL; ..... - if (nd.last.name[nd.last.len]) - goto slashes; inode = dentry->d_inode; - if (inode) - ihold(inode); + if (nd.last.name[nd.last.len] || !inode) + goto slashes; + ihold(inode) ... if (inode) iput(inode); /* truncate the inode here */ If nd.last.name[nd.last.len] is nonzero (and thus goto slashes branch is taken), and dentry->d_inode is non-NULL, then this code now does an additional iput on the inode, which is wrong. Fix this by only setting the inode variable if nd.last.name[nd.last.len] is 0. Reference: https://lkml.org/lkml/2011/6/15/50 Reported-by: Norbert Preining <preining@logic.at> Reported-by: Török Edwin <edwintorok@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-16Revert "fs/exec.c: use BUILD_BUG_ON for VM_STACK_FLAGS & ↵Linus Torvalds1-1/+1
VM_STACK_INCOMPLETE_SETUP" This reverts commit 7f81c8890c15a10f5220bebae3b6dfae4961962a. It turns out that it's not actually a build-time check on x86-64 UML, which does some seriously crazy stuff with VM_STACK_FLAGS. The VM_STACK_FLAGS define depends on the arch-supplied VM_STACK_DEFAULT_FLAGS value, and on x86-64 UML we have arch/um/sys-x86_64/shared/sysdep/vm-flags.h: #define VM_STACK_DEFAULT_FLAGS \ (test_thread_flag(TIF_IA32) ? vm_stack_flags32 : vm_stack_flags) #define VM_STACK_DEFAULT_FLAGS vm_stack_flags (yes, seriously: two different #define's for that thing, with the first one being inside an "#ifdef TIF_IA32") It's possible that it is UML that should just be fixed in this area, but for now let's just undo the (very small) optimization. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-16fs/exec.c: use BUILD_BUG_ON for VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUPMichal Hocko1-1/+1
Commit a8bef8ff6ea1 ("mm: migration: avoid race between shift_arg_pages() and rmap_walk() during migration by not migrating temporary stacks") introduced a BUG_ON() to ensure that VM_STACK_FLAGS and VM_STACK_INCOMPLETE_SETUP do not overlap. The check is a compile time one, so BUILD_BUG_ON is more appropriate. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-16proc: Fix Oops on stat of /proc/<zombie pid>/ns/netEric W. Biederman1-3/+6
Don't call iput with the inode half setup to be a namespace filedescriptor. Instead rearrange the code so that we don't initialize ei->ns_ops until after I ns_ops->get succeeds, preventing us from invoking ns_ops->put when ns_ops->get failed. Reported-by: Ingo Saitz <Ingo.Saitz@stud.uni-hannover.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2011-06-15Btrfs: set no_trans_join after trying to expand the transactionJosef Bacik1-3/+11
We can lockup if we try to allow new writers join the transaction and we have flushoncommit set or have a pending snapshot. This is because we set no_trans_join and then loop around and try to wait for ordered extents again. The problem is the ordered endio stuff needs to join the transaction, which it can't do because no_trans_join is set. So instead wait until after this loop to set no_trans_join and then make sure to wait for num_writers == 1 in case anybody got started in between us exiting the loop and setting no_trans_join. This could easily be reproduced by mounting -o flushoncommit and running xfstest 13. It cannot be reproduced with this patch. Thanks, Reported-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Josef Bacik <josef@redhat.com>
2011-06-15Btrfs: protect the pending_snapshots list with trans_lockJosef Bacik1-0/+2
Currently there is nothing protecting the pending_snapshots list on the transaction. We only hold the directory mutex that we are snapshotting and a read lock on the subvol_sem, so we could race with somebody else creating a snapshot in a different directory and end up with list corruption. So protect this list with the trans_lock. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-06-15Btrfs: fix path leakage on subvol deletionJosef Bacik1-0/+1
The delayed ref patch accidently removed the btrfs_free_path in btrfs_unlink_subvol, this puts it back and means we don't leak a path. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-06-14[CIFS] update cifs version to 1.73Steve French1-1/+1
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-14xfs: fix ->mknod() return value on xfs_get_acl() failureAl Viro1-1/+1
->mknod() should return negative on errors and PTR_ERR() gives already negative value... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-06-14[CIFS] trivial cleanup fscache cFYI and cERROR messagesSteve French2-30/+27
... for uniformity and cleaner debug logs. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-14timerfd: Fix wakeup of processes when timer is cancelled on clock changeMax Asbock1-1/+4
Currently processes waiting with poll on cancelable timerfd timers are not woken up when the timers are canceled. When the system time is set the clock_was_set() function calls timerfd_clock_was_set() to cancel and wake up processes waiting on potential cancelable timerfd timers. However the wake up currently has no effect because in the case of timerfd_read it is dependent on ctx->ticks not being 0. timerfd_poll also requires ctx->ticks being non zero. As a consequence processes waiting on cancelable timers only get woken up when the timers expire. This patch fixes this by incrementing ctx->ticks before calling wake_up. Signed-off-by: Max Asbock <masbock@linux.vnet.ibm.com> Cc: kay.sievers@vrfy.org Cc: virtuoso@slind.org Cc: johnstul <johnstul@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1307985512.4710.41.camel@w-amax.beaverton.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-14cifs: correctly handle NULL tcon pointer in CIFSTConJeff Layton1-1/+5
Long ago (in commit 00e485b0), I added some code to handle share-level passwords in CIFSTCon. That code ignored the fact that it's legit to pass in a NULL tcon pointer when connecting to the IPC$ share on the server. This wasn't really a problem until recently as we only called CIFSTCon this way when the server returned -EREMOTE. With the introduction of commit c1508ca2 however, it gets called this way on every mount, causing an oops when share-level security is in effect. Fix this by simply treating a NULL tcon pointer as if user-level security were in effect. I'm not aware of any servers that protect the IPC$ share with a specific password anyway. Also, add a comment to the top of CIFSTCon to ensure that we don't make the same mistake again. Cc: <stable@kernel.org> Reported-by: Martijn Uffing <mp3project@sarijopen.student.utwente.nl> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-14cifs: show sec= option in /proc/mountsJeff Layton1-0/+33
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-14cifs: don't allow cifs_reconnect to exit with NULL socket pointerJeff Layton1-3/+3
It's possible for the following set of events to happen: cifsd calls cifs_reconnect which reconnects the socket. A userspace process then calls cifs_negotiate_protocol to handle the NEGOTIATE and gets a reply. But, while processing the reply, cifsd calls cifs_reconnect again. Eventually the GlobalMid_Lock is dropped and the reply from the earlier NEGOTIATE completes and the tcpStatus is set to CifsGood. cifs_reconnect then goes through and closes the socket and sets the pointer to zero, but because the status is now CifsGood, the new socket is not created and cifs_reconnect exits with the socket pointer set to NULL. Fix this by only setting the tcpStatus to CifsGood if the tcpStatus is CifsNeedNegotiate, and by making sure that generic_ip_connect is always called at least once in cifs_reconnect. Note that this is not a perfect fix for this issue. It's still possible that the NEGOTIATE reply is handled after the socket has been closed and reconnected. In that case, the socket state will look correct but it no NEGOTIATE was performed on it be for the wrong socket. In that situation though the server should just shut down the socket on the next attempted send, rather than causing the oops that occurs today. Cc: <stable@kernel.org> # .38.x: fd88ce9: [CIFS] cifs: clarify the meaning of tcpStatus == CifsGood Reported-and-Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-14CIFS: Fix sparse errorPavel Shilovsky1-7/+4
cifs_sb_master_tlink was declared as inline, but without a definition. Remove the declaration and move the definition up. Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-06-13Merge branch 'for-linus' of ↵Linus Torvalds10-53/+70
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: unwind canceled flock state ceph: fix ENOENT logic in striped_read ceph: fix short sync reads from the OSD ceph: fix sync vs canceled write ceph: use ihold when we already have an inode ref