summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-01-04pstore: remove __dev* attributes.Greg Kroah-Hartman2-13/+10
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit from the pstore filesystem. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Anton Vorontsov <cbouatmailru@gmail.com> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03Merge tag 'ecryptfs-3.8-rc2-fixes' of ↵Linus Torvalds3-6/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs fixes from Tyler Hicks: "Two self-explanatory fixes and a third patch which improves performance: when overwriting a full page in the eCryptfs page cache, skip reading in and decrypting the corresponding lower page." * tag 'ecryptfs-3.8-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: fs/ecryptfs/crypto.c: make ecryptfs_encode_for_filename() static eCryptfs: fix to use list_for_each_entry_safe() when delete items eCryptfs: Avoid unnecessary disk read and data decryption during writing
2013-01-02Merge tag 'ext4_for_linus' of ↵Linus Torvalds7-56/+138
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "Various bug fixes for ext4. Perhaps the most serious bug fixed is one which could cause file system corruptions when performing file punch operations." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: avoid hang when mounting non-journal filesystems with orphan list ext4: lock i_mutex when truncating orphan inodes ext4: do not try to write superblock on ro remount w/o journal ext4: include journal blocks in df overhead calcs ext4: remove unaligned AIO warning printk ext4: fix an incorrect comment about i_mutex ext4: fix deadlock in journal_unmap_buffer() ext4: split off ext4_journalled_invalidatepage() jbd2: fix assertion failure in jbd2_journal_flush() ext4: check dioread_nolock on remount ext4: fix extent tree corruption caused by hole punch
2013-01-02mempolicy: remove arg from mpol_parse_str, mpol_to_strHugh Dickins1-1/+1
Remove the unused argument (formerly no_context) from mpol_parse_str() and from mpol_to_str(). Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02epoll: prevent missed events on EPOLL_CTL_MODEric Wong1-1/+21
EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to ensure events are not missed. Since the modifications to the interest mask are not protected by the same lock as ep_poll_callback, we need to ensure the change is visible to other CPUs calling ep_poll_callback. We also need to ensure f_op->poll() has an up-to-date view of past events which occured before we modified the interest mask. So this barrier also pairs with the barrier in wq_has_sleeper(). This should guarantee either ep_poll_callback or f_op->poll() (or both) will notice the readiness of a recently-ready/modified item. This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in: http://thread.gmane.org/gmane.linux.kernel/1408782/ Signed-off-by: Eric Wong <normalperson@yhbt.net> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Voellmy <andreas.voellmy@yale.edu> Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu> Cc: netdev@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-27ext4: avoid hang when mounting non-journal filesystems with orphan listTheodore Ts'o1-1/+2
When trying to mount a file system which does not contain a journal, but which does have a orphan list containing an inode which needs to be truncated, the mount call with hang forever in ext4_orphan_cleanup() because ext4_orphan_del() will return immediately without removing the inode from the orphan list, leading to an uninterruptible loop in kernel code which will busy out one of the CPU's on the system. This can be trivially reproduced by trying to mount the file system found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs source tree. If a malicious user were to put this on a USB stick, and mount it on a Linux desktop which has automatic mounts enabled, this could be considered a potential denial of service attack. (Not a big deal in practice, but professional paranoids worry about such things, and have even been known to allocate CVE numbers for such problems.) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
2012-12-27ext4: lock i_mutex when truncating orphan inodesTheodore Ts'o1-0/+2
Commit c278531d39 added a warning when ext4_flush_unwritten_io() is called without i_mutex being taken. It had previously not been taken during orphan cleanup since races weren't possible at that point in the mount process, but as a result of this c278531d39, we will now see a kernel WARN_ON in this case. Take the i_mutex in ext4_orphan_cleanup() to suppress this warning. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Cc: stable@vger.kernel.org
2012-12-26f2fs: Don't assign e_id in f2fs_acl_from_diskEric W. Biederman1-1/+0
With user namespaces enabled building f2fs fails with: CC fs/f2fs/acl.o fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’: fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’ make[2]: *** [fs/f2fs/acl.o] Error 1 make[2]: Target `__build' not remade because of errors. e_id is a backwards compatibility field only used for file systems that haven't been converted to use kuids and kgids. When the posix acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is unnecessary. Remove the assignment so f2fs will build with user namespaces enabled. Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Acked-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-26proc: Allow proc_free_inum to be called from any contextEric W. Biederman1-6/+7
While testing the pid namespace code I hit this nasty warning. [ 176.262617] ------------[ cut here ]------------ [ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0() [ 176.265145] Hardware name: Bochs [ 176.265677] Modules linked in: [ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18 [ 176.266564] Call Trace: [ 176.266564] [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0 [ 176.266564] [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20 [ 176.266564] [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0 [ 176.266564] [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20 [ 176.266564] [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50 [ 176.266564] [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80 [ 176.266564] [<ffffffff8111d195>] put_pid_ns+0x35/0x50 [ 176.266564] [<ffffffff810c608a>] put_pid+0x4a/0x60 [ 176.266564] [<ffffffff8146b177>] tty_ioctl+0x717/0xc10 [ 176.266564] [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90 [ 176.266564] [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10 [ 176.266564] [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70 [ 176.266564] [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550 [ 176.266564] [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60 [ 176.266564] [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80 [ 176.266564] [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0 [ 176.266564] [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0 [ 176.266564] [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50 [ 176.266564] [<ffffffff81939199>] system_call_fastpath+0x16/0x1b [ 176.266564] ---[ end trace 387af88219ad6143 ]--- It turns out that spin_unlock_bh(proc_inum_lock) is not safe when put_pid is called with another spinlock held and irqs disabled. For now take the easy path and use spin_lock_irqsave(proc_inum_lock) in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-25ext4: do not try to write superblock on ro remount w/o journalMichael Tokarev1-1/+1
When a journal-less ext4 filesystem is mounted on a read-only block device (blockdev --setro will do), each remount (for other, unrelated, flags, like suid=>nosuid etc) results in a series of scary messages from kernel telling about I/O errors on the device. This is becauese of the following code ext4_remount(): if (sbi->s_journal == NULL) ext4_commit_super(sb, 1); at the end of remount procedure, which forces writing (flushing) of a superblock regardless whenever it is dirty or not, if the filesystem is readonly or not, and whenever the device itself is readonly or not. We only need call ext4_commit_super when the file system had been previously mounted read/write. Thanks to Eric Sandeen for help in diagnosing this issue. Signed-off-By: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-12-25ext4: include journal blocks in df overhead calcsEric Sandeen1-0/+4
To more accurately calculate overhead for "bsd" style df reporting, we should count the journal blocks as overhead as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Eric Whitney <enwlinux@gmail.com>
2012-12-25ext4: remove unaligned AIO warning printkEric Sandeen1-8/+0
Although I put this in, I now think it was a bad decision. For most users, there is very little to be done in this case. They get the message, once per day, with no real context or proposed action. TBH, it generates support calls when it probably does not need to; the message sounds more dire than the situation really is. Just nuke it. Normal investigation via blktrace or whatnot can reveal poor IO patterns if bad performance is encountered. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-12-25ext4: fix an incorrect comment about i_mutexAndy Lutomirski1-2/+0
i_mutex is not held when ->sync_file is called. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-12-25ext4: fix deadlock in journal_unmap_buffer()Jan Kara2-24/+85
We cannot wait for transaction commit in journal_unmap_buffer() because we hold page lock which ranks below transaction start. We solve the issue by bailing out of journal_unmap_buffer() and jbd2_journal_invalidatepage() with -EBUSY. Caller is then responsible for waiting for transaction commit to finish and try invalidation again. Since the issue can happen only for page stradding i_size, it is simple enough to manually call jbd2_journal_invalidatepage() for such page from ext4_setattr(), check the return value and wait if necessary. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-12-25ext4: split off ext4_journalled_invalidatepage()Jan Kara1-7/+16
In data=journal mode we don't need delalloc or DIO handling in invalidatepage and similarly in other modes we don't need the journal handling. So split invalidatepage implementations. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-12-22Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds4-17/+18
Pull CIFS fixes from Steve French: "Misc small cifs fixes" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: eliminate cifsERROR variable cifs: don't compare uniqueids in cifs_prime_dcache unless server inode numbers are in use cifs: fix double-free of "string" in cifs_parse_mount_options
2012-12-22Revert "nfsd: warn on odd reply state in nfsd_vfs_read"J. Bruce Fields1-1/+0
This reverts commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e. This is obviously wrong, and I have no idea how I missed seeing the warning in testing: I must just not have looked at the right logs. The caller bumps rq_resused/rq_next_page, so it will always be hit on a large enough read. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21NFS: Kill fscache warnings when mounting without -ofscTrond Myklebust1-4/+15
The fscache code will currently bleat a "non-unique superblock keys" warning even if the user is mounting without the 'fsc' option. There should be no reason to even initialise the superblock cache cookie unless we're planning on using fscache for something, so ensure that we check for the NFS_OPTION_FSCACHE flag before calling into the fscache code. Reported-by: Paweł Sikora <pawel.sikora@agmk.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: David Howells <dhowells@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21NFS: Provide stub nfs_fscache_wait_on_invalidate() for when CONFIG_NFS_FSCACHE=nDavid Howells1-0/+1
Provide a stub nfs_fscache_wait_on_invalidate() function for when CONFIG_NFS_FSCACHE=n lest the following error appear: fs/nfs/inode.c: In function 'nfs_invalidate_mapping': fs/nfs/inode.c:887:2: error: implicit declaration of function 'nfs_fscache_wait_on_invalidate' [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors Reported-by: kbuild test robot <fengguang.wu@intel.com> Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21jbd2: fix assertion failure in jbd2_journal_flush()Jan Kara1-1/+2
The following race is possible between start_this_handle() and someone calling jbd2_journal_flush(). Process A Process B start_this_handle(). if (journal->j_barrier_count) # false if (!journal->j_running_transaction) { #true read_unlock(&journal->j_state_lock); jbd2_journal_lock_updates() jbd2_journal_flush() write_lock(&journal->j_state_lock); if (journal->j_running_transaction) { # false ... wait for committing trans ... write_unlock(&journal->j_state_lock); ... write_lock(&journal->j_state_lock); if (!journal->j_running_transaction) { # true jbd2_get_transaction(journal, new_transaction); write_unlock(&journal->j_state_lock); goto repeat; # eventually blocks on j_barrier_count > 0 ... J_ASSERT(!journal->j_running_transaction); # fails We fix the race by rechecking j_barrier_count after reacquiring j_state_lock in exclusive mode. Reported-by: yjwsignal@empal.com Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-12-21Merge branch 'for-next' of git://git.infradead.org/users/eparis/notifyLinus Torvalds11-104/+152
Pull filesystem notification updates from Eric Paris: "This pull mostly is about locking changes in the fsnotify system. By switching the group lock from a spin_lock() to a mutex() we can now hold the lock across things like iput(). This fixes a problem involving unmounting a fs and having inodes be busy, first pointed out by FAT, but reproducible with tmpfs. This also restores signal driven I/O for inotify, which has been broken since about 2.6.32." Ugh. I *hate* the timing of this. It was rebased after the merge window opened, and then left to sit with the pull request coming the day before the merge window closes. That's just crap. But apparently the patches themselves have been around for over a year, just gathering dust, so now it's suddenly critical. Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen Rothwell's fixes from -next. * 'for-next' of git://git.infradead.org/users/eparis/notify: inotify: automatically restart syscalls inotify: dont skip removal of watch descriptor if creation of ignored event failed fanotify: dont merge permission events fsnotify: make fasync generic for both inotify and fanotify fsnotify: change locking order fsnotify: dont put marks on temporary list when clearing marks by group fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark() fsnotify: pass group to fsnotify_destroy_mark() fsnotify: use a mutex instead of a spinlock to protect a groups mark list fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed fsnotify: take groups mark_lock before mark lock fsnotify: use reference counting for groups fsnotify: introduce fsnotify_get_group() inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()
2012-12-21Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds13-23/+85
Merge the rest of Andrew's patches for -rc1: "A bunch of fixes and misc missed-out-on things. That'll do for -rc1. I still have a batch of IPC patches which still have a possible bug report which I'm chasing down." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits) keys: use keyring_alloc() to create module signing keyring keys: fix unreachable code sendfile: allows bypassing of notifier events SGI-XP: handle non-fatal traps fat: fix incorrect function comment Documentation: ABI: remove testing/sysfs-devices-node proc: fix inconsistent lock state linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors memcg: don't register hotcpu notifier from ->css_alloc() checkpatch: warn on uapi #includes that #include <uapi/... revert "rtc: recycle id when unloading a rtc driver" mm: clean up transparent hugepage sysfs error messages hfsplus: add error message for the case of failure of sync fs in delayed_sync_fs() method hfsplus: rework processing of hfs_btree_write() returned error hfsplus: rework processing errors in hfsplus_free_extents() hfsplus: avoid crash on failed block map free kcmp: include linux/ptrace.h drivers/rtc/rtc-imxdi.c: must include <linux/spinlock.h> mm: cma: WARN if freed memory is still in use exec: do not leave bprm->interp on stack ...
2012-12-21Merge branch 'for-linus' of ↵Linus Torvalds64-435/+1096
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS update from Al Viro: "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted misc stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits) vfs: make lremovexattr retry once on ESTALE error vfs: make removexattr retry once on ESTALE vfs: make llistxattr retry once on ESTALE error vfs: make listxattr retry once on ESTALE error vfs: make lgetxattr retry once on ESTALE vfs: make getxattr retry once on an ESTALE error vfs: allow lsetxattr() to retry once on ESTALE errors vfs: allow setxattr to retry once on ESTALE errors vfs: allow utimensat() calls to retry once on an ESTALE error vfs: fix user_statfs to retry once on ESTALE errors vfs: make fchownat retry once on ESTALE errors vfs: make fchmodat retry once on ESTALE errors vfs: have chroot retry once on ESTALE error vfs: have chdir retry lookup and call once on ESTALE error vfs: have faccessat retry once on an ESTALE error vfs: have do_sys_truncate retry once on an ESTALE error vfs: fix renameat to retry on ESTALE errors vfs: make do_unlinkat retry once on ESTALE errors vfs: make do_rmdir retry once on ESTALE errors vfs: add a flags argument to user_path_parent ...
2012-12-21Merge branch 'for-linus' of ↵Linus Torvalds1-21/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-21sendfile: allows bypassing of notifier eventsScott Wolchok1-0/+2
do_sendfile() in fs/read_write.c does not call the fsnotify functions, unlike its neighbors. This manifests as a lack of inotify ACCESS events when a file is sent using sendfile(2). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=12812 [akpm@linux-foundation.org: use fsnotify_modify(out.file), not fsnotify_access(), per Dave] Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Scott Wolchok <swolchok@umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21fat: fix incorrect function commentRavishankar N3-4/+7
fat_search_long() returns 0 on success, -ENOENT/ENOMEM on failure. Change the function comment accordingly. While at it, fix some trivial typos. Signed-off-by: Ravishankar N <cyberax82@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21proc: fix inconsistent lock stateXiaotian Feng1-6/+6
Lockdep found an inconsistent lock state when rcu is processing delayed work in softirq. Currently, kernel is using spin_lock/spin_unlock to protect proc_inum_ida, but proc_free_inum is called by rcu in softirq context. Use spin_lock_bh/spin_unlock_bh fix following lockdep warning. ================================= [ INFO: inconsistent lock state ] 3.7.0 #36 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (proc_inum_lock){+.?...}, at: proc_free_inum+0x1c/0x50 {SOFTIRQ-ON-W} state was registered at: __lock_acquire+0x8ae/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_alloc_inum+0x4c/0xd0 alloc_mnt_ns+0x49/0xc0 create_mnt_ns+0x25/0x70 mnt_init+0x161/0x1c7 vfs_caches_init+0x107/0x11a start_kernel+0x348/0x38c x86_64_start_reservations+0x131/0x136 x86_64_start_kernel+0x103/0x112 irq event stamp: 2993422 hardirqs last enabled at (2993422): _raw_spin_unlock_irqrestore+0x55/0x80 hardirqs last disabled at (2993421): _raw_spin_lock_irqsave+0x29/0x70 softirqs last enabled at (2993394): _local_bh_enable+0x13/0x20 softirqs last disabled at (2993395): call_softirq+0x1c/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(proc_inum_lock); <Interrupt> lock(proc_inum_lock); *** DEADLOCK *** no locks held by swapper/1/0. stack backtrace: Pid: 0, comm: swapper/1 Not tainted 3.7.0 #36 Call Trace: <IRQ> [<ffffffff810a40f1>] ? vprintk_emit+0x471/0x510 print_usage_bug+0x2a5/0x2c0 mark_lock+0x33b/0x5e0 __lock_acquire+0x813/0xca0 lock_acquire+0x199/0x200 _raw_spin_lock+0x41/0x50 proc_free_inum+0x1c/0x50 free_pid_ns+0x1c/0x50 put_pid_ns+0x2e/0x50 put_pid+0x4a/0x60 delayed_put_pid+0x12/0x20 rcu_process_callbacks+0x462/0x790 __do_softirq+0x1b4/0x3b0 call_softirq+0x1c/0x30 do_softirq+0x59/0xd0 irq_exit+0x54/0xd0 smp_apic_timer_interrupt+0x95/0xa3 apic_timer_interrupt+0x72/0x80 cpuidle_enter_tk+0x10/0x20 cpuidle_enter_state+0x17/0x50 cpuidle_idle_call+0x287/0x520 cpu_idle+0xba/0x130 start_secondary+0x2b3/0x2bc Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21hfsplus: add error message for the case of failure of sync fs in ↵Vyacheslav Dubeyko1-1/+4
delayed_sync_fs() method Add an error message for the case of failure of sync fs in delayed_sync_fs() method. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21hfsplus: rework processing of hfs_btree_write() returned errorVyacheslav Dubeyko3-5/+12
Add to hfs_btree_write() a return of -EIO on failure of b-tree node searching. Also add logic ofor processing errors from hfs_btree_write() in hfsplus_system_write_inode() with a message about b-tree writing failure. [akpm@linux-foundation.org: reduce scope of `err', print errno on error] Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21hfsplus: rework processing errors in hfsplus_free_extents()Vyacheslav Dubeyko1-4/+20
Currently, it doesn't process error codes from the hfsplus_block_free() call in hfsplus_free_extents() method. Add some error code processing. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21hfsplus: avoid crash on failed block map freeAlan Cox1-1/+12
If the read fails we kmap an error code. This doesn't end well. Instead print a critical error and pray. This mirrors the rest of the fs behaviour with critical error cases. Acked-by: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21exec: do not leave bprm->interp on stackKees Cook3-2/+22
If a series of scripts are executed, each triggering module loading via unprintable bytes in the script header, kernel stack contents can leak into the command line. Normally execution of binfmt_script and binfmt_misc happens recursively. However, when modules are enabled, and unprintable bytes exist in the bprm->buf, execution will restart after attempting to load matching binfmt modules. Unfortunately, the logic in binfmt_script and binfmt_misc does not expect to get restarted. They leave bprm->interp pointing to their local stack. This means on restart bprm->interp is left pointing into unused stack memory which can then be copied into the userspace argv areas. After additional study, it seems that both recursion and restart remains the desirable way to handle exec with scripts, misc, and modules. As such, we need to protect the changes to interp. This changes the logic to require allocation for any changes to the bprm->interp. To avoid adding a new kmalloc to every exec, the default value is left as-is. Only when passing through binfmt_script or binfmt_misc does an allocation take place. For a proof of concept, see DoTest.sh from: http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/ Signed-off-by: Kees Cook <keescook@chromium.org> Cc: halfdog <me@halfdog.net> Cc: P J P <ppandit@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-21vfs: make lremovexattr retry once on ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make removexattr retry once on ESTALEJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make llistxattr retry once on ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make listxattr retry once on ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make lgetxattr retry once on ESTALEJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make getxattr retry once on an ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: allow lsetxattr() to retry once on ESTALE errorsJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: allow setxattr to retry once on ESTALE errorsJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: allow utimensat() calls to retry once on an ESTALE errorJeff Layton1-1/+5
Clearly, we can't handle the NULL filename case, but we can deal with the case where there's a real pathname. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: fix user_statfs to retry once on ESTALE errorsJeff Layton1-1/+8
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make fchownat retry once on ESTALE errorsJeff Layton1-0/+5
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make fchmodat retry once on ESTALE errorsJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: have chroot retry once on ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: have chdir retry lookup and call once on ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: have faccessat retry once on an ESTALE errorJeff Layton1-2/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: have do_sys_truncate retry once on an ESTALE errorJeff Layton1-1/+7
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: fix renameat to retry on ESTALE errorsJeff Layton1-3/+12
...as always, rename is the messiest of the bunch. We have to track whether to retry or not via a separate flag since the error handling is already quite complex. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-21vfs: make do_unlinkat retry once on ESTALE errorsJeff Layton1-2/+8
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>