summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-12-12kernfs: drop s_ prefix from kernfs_node membersTejun Heo8-178/+174
kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. s_ prefix for kernfs members is used inconsistently and a misnomer now. It's not like kernfs_node is used widely across the kernel making the ability to grep for the members particularly useful. Let's just drop the prefix. This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-12kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordinglyTejun Heo12-668/+669
kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_elem_dir/kernfs_elem_dir/ * s/sysfs_elem_symlink/kernfs_elem_symlink/ * s/sysfs_elem_attr/kernfs_elem_file/ * s/sysfs_dirent/kernfs_node/ * s/sd/kn/ in kernfs proper * s/parent_sd/parent/ * s/target_sd/target/ * s/dir_sd/parent/ * s/to_sysfs_dirent()/rb_to_kn()/ * misc renames of local vars when they conflict with the above Because md, mic and gpio dig into sysfs details, this patch ends up modifying them. All are sysfs_dirent renames and trivial. While we can avoid these by introducing a dummy wrapping struct sysfs_dirent around kernfs_node, given the limited usage outside kernfs and sysfs proper, I don't think such workaround is called for. This patch is strictly rename only and doesn't introduce any functional difference. - mic / gpio renames were missing. Spotted by kbuild test robot. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-11sysfs: fix use-after-free in sysfs_kill_sb()Tejun Heo1-1/+3
While restructuring the [u]mount path, 4b93dc9b1c68 ("sysfs, kernfs: prepare mount path for kernfs") incorrectly updated sysfs_kill_sb() so that it first kills super_block and then tries to dereference its namespace tag to drop it. Fix it by caching namespace tag before killing the superblock and then drop the cached namespace tag. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/g/20131205031051.GC5135@yliu-dev.sh.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-11sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warningTejun Heo2-8/+22
This is v3.14 fix for the same issue that a8b14744429f ("sysfs: give different locking key to regular and bin files") addresses for v3.13. Due to the extensive kernfs reorganization in v3.14 branch, the same fix couldn't be ported as-is. The v3.13 fix was ignored while merging it into v3.14 branch. 027a485d12e0 ("sysfs: use a separate locking class for open files depending on mmap") assigned different lockdep key to sysfs_open_file->mutex depending on whether the file implements mmap or not in an attempt to avoid spurious lockdep warning caused by merging of regular and bin file paths. While this restored some of the original behavior of using different locks (at least lockdep is concerned) for the different clases of files. The restoration wasn't full because now the lockdep key assignment depends on whether the file has mmap or not instead of whether it's a regular file or not. This means that bin files which don't implement mmap will get assigned the same lockdep class as regular files. This is problematic because file_operations for bin files still implements the mmap file operation and checking whether the sysfs file actually implements mmap happens in the file operation after grabbing @sysfs_open_file->mutex. We still end up adding locking dependency from mmap locking to sysfs_open_file->mutex to the regular file mutex which triggers spurious circular locking warning. For v3.13, a8b14744429f ("sysfs: give different locking key to regular and bin files") fixed it by giving sysfs_open_file->mutex different lockdep keys depending on whether the file is regular or bin instead of whether mmap exists or not; however, due to the way sysfs is now layered behind kernfs, this approach is no longer viable. kernfs can tell whether a sysfs node has mmap implemented or not but can't tell whether a bin file from a regular one. This patch updates kernfs such that kernfs_file_mmap() checks SYSFS_FLAG_HAS_MMAP and bail before grabbing sysfs_open_file->mutex so that it doesn't add spurious locking dependency from mmap to sysfs_open_file->mutex and changes sysfs so that it specifies kernfs_ops->mmap iff the sysfs file implements mmap. Combined, this ensures that sysfs_open_file->mutex is grabbed under mmap path iff the sysfs file actually implements mmap. As sysfs_open_file->mutex is already given a different lockdep key if mmap is implemented, this removes the spurious locking dependency. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-10Merge branch 'driver-core-linus' into driver-core-nextTejun Heo17-121/+103
a8b14744429f ("sysfs: give different locking key to regular and bin files") in driver-core-linus modifies sysfs_open_file() so that it gives out different locking classes to sysfs_open_files depending on whether the file is bin or not. Due to the massive kernfs reorganization in driver-core-next, this naturally causes merge conflict in fs/sysfs/file.c. Due to the way things are split between kernfs and sysfs in driver-core-next, the same fix can't easily be applied to driver-core-next. This merge simply ignores the offending commit. A following patch will implement a separate fix for the issue. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-12-09sysfs, kernfs: remove duplicated include from file.cWei Yongjun1-1/+0
Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-08sysfs: give different locking key to regular and bin filesTejun Heo1-5/+3
027a485d12e0 ("sysfs: use a separate locking class for open files depending on mmap") assigned different lockdep key to sysfs_open_file->mutex depending on whether the file implements mmap or not in an attempt to avoid spurious lockdep warning caused by merging of regular and bin file paths. While this restored some of the original behavior of using different locks (at least lockdep is concerned) for the different clases of files. The restoration wasn't full because now the lockdep key assignment depends on whether the file has mmap or not instead of whether it's a regular file or not. This means that bin files which don't implement mmap will get assigned the same lockdep class as regular files. This is problematic because file_operations for bin files still implements the mmap file operation and checking whether the sysfs file actually implements mmap happens in the file operation after grabbing @sysfs_open_file->mutex. We still end up adding locking dependency from mmap locking to sysfs_open_file->mutex to the regular file mutex which triggers spurious circular locking warning. Fix it by restoring the original behavior fully by differentiating lockdep key by whether the file is regular or bin, instead of the existence of mmap. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-06Merge git://git.kvack.org/~bcrl/aio-nextLinus Torvalds1-2/+6
Pull aio fix from Benjamin LaHaise: "AIO fix from Gu Zheng that fixes a GPF that Dave Jones uncovered with trinity" * git://git.kvack.org/~bcrl/aio-next: aio: clean up aio ring in the fail path
2013-12-06aio: clean up aio ring in the fail pathGu Zheng1-2/+6
Clean up the aio ring file in the fail path of aio_setup_ring and ioctx_alloc. And maybe it can fix the GPF issue reported by Dave Jones: https://lkml.org/lkml/2013/11/25/898 Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
2013-12-06Merge tag 'pm-3.13-rc3' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: - cpufreq regression fix from Bjørn Mork restoring the pre-3.12 behavior of the framework during system suspend/hibernation to avoid garbage sysfs files from being left behind in case of a suspend error - PNP regression fix to restore the correct states of devices after resume from hibernation broken in 3.12. From Dmitry Torokhov. - cpuidle fix to prevent cpuidle device unregistration from crashing due to a NULL pointer dereference if cpuidle has been disabled from the kernel command line. From Konrad Rzeszutek Wilk. - intel_idle fix for the C6 state definition on Intel Avoton/Rangeley processors from Arne Bockholdt. - Power capping framework fix to make the energy_uj sysfs attribute work in accordance with the documentation. From Srinivas Pandruvada. - epoll fix to make it ignore the EPOLLWAKEUP flag if the kernel has been compiled with CONFIG_PM_SLEEP unset (in which case that flag should not have any effect). From Amit Pundir. - cpufreq fix to prevent governor sysfs files from being lost over system suspend/resume in some (arguably unusual) situations. From Viresh Kumar. * tag 'pm-3.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PowerCap: Fix mode for energy counter PNP: fix restoring devices after hibernation cpuidle: Check for dev before deregistering it. epoll: drop EPOLLWAKEUP if PM_SLEEP is disabled cpufreq: fix garbage kobjects on errors during suspend/resume cpufreq: suspend governors on system suspend/hibernate intel_idle: Fixed C6 state on Avoton/Rangeley processors
2013-12-06Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds6-87/+22
Pull block layer fixes from Jens Axboe: "A small collection of fixes for the current series. It contains: - A fix for a use-after-free of a request in blk-mq. From Ming Lei - A fix for a blk-mq bug that could attempt to dereference a NULL rq if allocation failed - Two xen-blkfront small fixes - Cleanup of submit_bio_wait() type uses in the kernel, unifying that. From Kent - A fix for 32-bit blkg_rwstat reading. I apologize for this one looking mangled in the shortlog, it's entirely my fault for missing an empty line between the description and body of the text" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: fix use-after-free of request blk-mq: fix dereference of rq->mq_ctx if allocation fails block: xen-blkfront: Fix possible NULL ptr dereference xen-blkfront: Silence pfn maybe-uninitialized warning block: submit_bio_wait() conversions Update of blkg_stat and blkg_rwstat may happen in bh context
2013-12-06Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds7-9/+51
Pull NFS client bugfixes from Trond Myklebust: - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings * tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix do_div() warning by instead using sector_div() MAINTAINERS: Update contact information for Trond Myklebust NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery SUNRPC: do not fail gss proc NULL calls with EACCES NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKED NFSv4: Update list of irrecoverable errors on DELEGRETURN NFSv4 wait on recovery for async session errors NFS: Fix a warning in nfs_setsecurity NFS: Enabling v4.2 should not recompile nfsd and lockd
2013-12-04nfs: fix do_div() warning by instead using sector_div()Helge Deller1-1/+1
When compiling a 32bit kernel with CONFIG_LBDAF=n the compiler complains like shown below. Fix this warning by instead using sector_div() which is provided by the kernel.h header file. fs/nfs/blocklayout/extents.c: In function ‘normalize’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] fs/nfs/blocklayout/extents.c:47:13: note: in expansion of macro ‘do_div’ nfs/blocklayout/extents.c:47:2: warning: right shift count >= width of type [enabled by default] fs/nfs/blocklayout/extents.c:47:2: warning: passing argument 1 of ‘__div64_32’ from incompatible pointer type [enabled by default] include/asm-generic/div64.h:35:17: note: expected ‘uint64_t *’ but argument is of type ‘sector_t *’ extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor); Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-04NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recoveryTrond Myklebust1-1/+8
Andy Adamson reports: The state manager is recovering expired state and recovery OPENs are being processed. If kswapd is pruning inodes at the same time, a deadlock can occur when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the resultant layoutreturn gets an error that the state mangager is to handle, causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq. At the same time an open is waiting for the inode deletion to complete in __wait_on_freeing_inode. If the open is either the open called by the state manager, or an open from the same open owner that is holding the NFSv4 sequence id which causes the OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue, then the state is deadlocked with kswapd. The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY. We already know that layouts are dropped on all server reboots, and that it has to be coded to deal with the "forgetful client model" that doesn't send layoutreturns. Reported-by: Andy Adamson <andros@netapp.com> Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
2013-12-04Merge tag 'squashfs-fixes' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next Pull squashfs bugfix from Phillip Lougher: "Just a single bug fix to the new "directly decompress into the page cache" code" * tag 'squashfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: fix failure to unlock pages on decompress error
2013-12-04kernfs: implement "trusted.*" xattr supportTejun Heo4-11/+74
kernfs inherited "security.*" xattr support from sysfs. This patch extends xattr support to "trusted.*" using simple_xattr_*(). As trusted xattrs are restricted to CAP_SYS_ADMIN, simple_xattr_*() which uses kernel memory for storage shouldn't be problematic. Note that the existing "security.*" support doesn't implement get/remove/list and the this patch only implements those ops for "trusted.*". We probably want to extend those ops to include support for "security.*". This patch will allow using kernfs from cgroup which requires "trusted.*" xattr support. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: David P. Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-04kernfs: update sysfs_init_inode_attrs()Tejun Heo2-37/+29
sysfs_init_inode_attrs() is a bit clumsy to use requiring the caller to check whether @sd->s_iattr is already set or not. Rename it to sysfs_inode_attrs(), update it to check whether @sd->s_iattr is already initialized before trying to initialize it and return @sd->s_iattr. This simplifies the callers. While at it, * Rename struct sysfs_inode_attrs pointer variables to "attrs". As kernfs no longer deals with "struct attribute", this isn't confusing and makes it easier to distinguish from struct iattr pointers. * A new field will be added to sysfs_inode_attrs. Reindent in preparation. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-03epoll: drop EPOLLWAKEUP if PM_SLEEP is disabledAmit Pundir1-2/+1
Drop EPOLLWAKEUP from epoll events mask if CONFIG_PM_SLEEP is disabled. Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-02vfs: fix subtle use-after-free of pipe_inode_infoLinus Torvalds1-20/+19
The pipe code was trying (and failing) to be very careful about freeing the pipe info only after the last access, with a pattern like: spin_lock(&inode->i_lock); if (!--pipe->files) { inode->i_pipe = NULL; kill = 1; } spin_unlock(&inode->i_lock); __pipe_unlock(pipe); if (kill) free_pipe_info(pipe); where the final freeing is done last. HOWEVER. The above is actually broken, because while the freeing is done at the end, if we have two racing processes releasing the pipe inode info, the one that *doesn't* free it will decrement the ->files count, and unlock the inode i_lock, but then still use the "pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)". This is *very* hard to trigger in practice, since the race window is very small, and adding debug options seems to just hide it by slowing things down. Simon originally reported this way back in July as an Oops in kmem_cache_allocate due to a single bit corruption (due to the final "spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different allocation that had re-used the free'd pipe-info), it's taken this long to figure out. Since the 'pipe->files' accesses aren't even protected by the pipe lock (we very much use the inode lock for that), the simple solution is to just drop the pipe lock early. And since there were two users of this pattern, create a helper function for it. Introduced commit ba5bb147330a ("pipe: take allocation and freeing of pipe_inode_info out of ->i_mutex"). Reported-by: Simon Kirby <sim@hostway.ca> Reported-by: Ian Applegate <ia@cloudflare.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org # v3.10+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-30sysfs, kernfs: remove cross inclusions of internal headersTejun Heo2-3/+0
fs/kernfs/kernfs-internal.h needed to include fs/sysfs/sysfs.h because part of kernfs core implementation was living in sysfs. fs/sysfs/sysfs.h needed to include fs/kernfs/kernfs-internal.h because include/linux/kernfs.h didn't expose enough interface. The separation is complete and neither is true anymore. Remove the cross inclusion and make sysfs a proper user of kernfs. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: implement kernfs_ns_enabled()Tejun Heo3-7/+7
fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for SYSFS_FLAG_NS. Let's add kernfs_ns_enabled() so that sysfs doesn't have to test sysfs_dirent flag directly. This makes things tidier for kernfs proper too. This is purely cosmetic. v2: To avoid possible NULL deref, use noop dummy implementation which always returns false when !CONFIG_SYSFS. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: make sysfs_dirent definition publicTejun Heo2-99/+1
sysfs_dirent includes some information which should be available to kernfs users - the type, flags, name and parent pointer. This patch moves sysfs_dirent definition from kernfs/kernfs-internal.h to include/linux/kernfs.h so that kernfs users can access them. The type part of flags is exported as enum kernfs_node_type, the flags kernfs_node_flag, sysfs_type() and kernfs_enable_ns() are moved to include/linux/kernfs.h and the former is updated to return the enum type. sysfs_dirent->s_parent and ->s_name are marked explicitly as public. This patch doesn't introduce any functional changes. v2: Flags exported too and kernfs_enable_ns() definition moved. v3: While moving kernfs_enable_ns() to include/linux/kernfs.h, v1 and v2 put the definition outside CONFIG_SYSFS replacing the dummy implementation with the actual implementation too. Unfortunately, this can lead to oops when !CONFIG_SYSFS because kernfs_enable_ns() may be called on a NULL @sd and now tries to dereference @sd instead of not doing anything. This issue was reported by Yuanhan Liu. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move mount core code to fs/kernfs/mount.cTejun Heo4-170/+178
Move core mount code to fs/kernfs/mount.c. The respective declarations in fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h. This is pure relocation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: prepare mount path for kernfsTejun Heo4-35/+78
We're in the process of separating out core sysfs functionality into kernfs which will deal with sysfs_dirents directly. This patch rearranges mount path so that the kernfs and sysfs parts are separate. * As sysfs_super_info won't be visible outside kernfs proper, kernfs_super_ns() is added to allow kernfs users to access a super_block's namespace tag. * Generic mount operation is separated out into kernfs_mount_ns(). sysfs_mount() now just performs sysfs-specific permission check, acquires namespace tag, and invokes kernfs_mount_ns(). * Generic superblock release is separated out into kernfs_kill_sb() which can be used directly as file_system_type->kill_sb(). As sysfs needs to put the namespace tag, sysfs_kill_sb() wraps kernfs_kill_sb() with ns tag put. * sysfs_dir_cachep init and sysfs_inode_init() are separated out into kernfs_init(). kernfs_init() uses only small amount of memory and trying to handle and propagate kernfs_init() failure doesn't make much sense. Use SLAB_PANIC for sysfs_dir_cachep and make sysfs_inode_init() panic on failure. After this change, kernfs_init() should be called before sysfs_init(), fs/namespace.c::mnt_init() modified accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: make super_blocks bind to different kernfs_rootsTejun Heo2-4/+12
kernfs is being updated to allow multiple sysfs_dirent hierarchies so that it can also be used by other users. Currently, sysfs super_blocks are always attached to one kernfs_root - sysfs_root - and distinguished only by their namespace tags. This patch adds sysfs_super_info->root and update sysfs_fill/test_super() so that super_blocks are identified by the combination of both the associated kernfs_root and namespace tag. This allows mounting different kernfs hierarchies. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: make inode number ida per kernfs_rootTejun Heo4-38/+19
kernfs is being updated to allow multiple sysfs_dirent hierarchies so that it can also be used by other users. Currently, inode number is allocated using a global ida, sysfs_ino_ida; however, inos for different hierarchies should be handled separately. This patch makes ino allocation per kernfs_root. sysfs_ino_ida is replaced by kernfs_root->ino_ida and sysfs_new_dirent() is updated to take @root and allocate ino from it. ida_simple_get/remove() are used instead of sysfs_ino_lock and sysfs_alloc/free_ino(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: implement kernfs_create/destroy_root()Tejun Heo3-20/+100
There currently is single kernfs hierarchy in the whole system which is used for sysfs. kernfs needs to support multiple hierarchies to allow other users. This patch introduces struct kernfs_root which serves as the root of each kernfs hierarchy and implements kernfs_create/destroy_root(). * Each kernfs_root is associated with a root sd (sysfs_dentry). The root is freed when the root sd is released and kernfs_destory_root() simply invokes kernfs_remove() on the root sd. sysfs_remove_one() is updated to handle release of the root sd. Note that ps_iattr update in sysfs_remove_one() is trivially updated for readability. * Root sd's are now dynamically allocated using sysfs_new_dirent(). Update sysfs_alloc_ino() so that it gives out ino from 1 so that the root sd still gets ino 1. * While kernfs currently only points to the root sd, it'll soon grow fields which are specific to each hierarchy. As determining a given sd's root will be necessary, sd->s_dir.root is added. This backlink fits better as a separate field in sd; however, sd->s_dir is inside union with space to spare, so use it to save space and provide kernfs_root() accessor to determine the root sd. * As hierarchies may be destroyed now, each mount needs to hold onto the hierarchy it's attached to. Update sysfs_fill_super() and sysfs_kill_sb() so that they get and put the kernfs_root respectively. * sysfs_root is replaced with kernfs_root which is dynamically created by invoking kernfs_create_root() from sysfs_init(). This patch doesn't introduce any visible behavior changes. v2: kernfs_create_root() forgot to set @sd->priv. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: introduce sysfs_root_sdTejun Heo4-9/+11
Currently, it's assumed that there's a single kernfs hierarchy in the system anchored at sysfs_root which is defined as a global struct. To allow other users of kernfs, this will be made dynamic. Introduce a new global variable sysfs_root_sd which points to &sysfs_root and convert all &sysfs_root users. This patch doesn't introduce any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: no need to kern_mount() sysfs from sysfs_init()Tejun Heo1-16/+7
It has been very long since sysfs depended on vfs to keep track of internal states and whether sysfs is mounted or not doesn't make any difference to sysfs's internal operation. In addition to init and filesystem type registration, sysfs_init() invokes kern_mount() to create in-kernel mount of sysfs. This internal mounting doesn't server any purpose anymore. Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: make sysfs_super_info->ns constTejun Heo2-8/+8
Add const qualifier to sysfs_super_info->ns so that it's consistent with other namespace tag usages in sysfs. Because kobject doesn't use const qualifier for namespace tags, this ends up requiring an explicit cast to drop const qualifier in free_sysfs_super_info(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: drop unused params from sysfs_fill_super()Tejun Heo1-2/+2
sysfs_fill_super() takes three params - @sb, @data and @silent - but uses only @sb. Drop the latter two. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move symlink core code to fs/kernfs/symlink.cTejun Heo4-138/+144
Move core symlink code to fs/kernfs/symlink.c. fs/sysfs/symlink.c now only contains sysfs wrappers around kernfs interfaces. The respective declarations in fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h. This is pure relocation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move file core code to fs/kernfs/file.cTejun Heo4-805/+813
Move core file code to fs/kernfs/file.c. fs/sysfs/file.c now contains sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces, and sysfs_schedule_callback(). The respective declarations in fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h. This is pure relocation. v2: Refreshed on top of the v2 of "sysfs, kernfs: prepare read path for kernfs". v3: Refreshed on top of the v3 of "sysfs, kernfs: prepare read path for kernfs". Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move dir core code to fs/kernfs/dir.cTejun Heo4-999/+1005
Move core dir code to fs/kernfs/dir.c. fs/sysfs/dir.c now only contains sysfs_warn_dup() and sysfs wrappers around kernfs interfaces. The respective declarations in fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h. This is pure relocation. v2: sysfs_symlink_target_lock was mistakenly relocated to kernfs. It should remain with sysfs. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move inode code to fs/kernfs/inode.cTejun Heo5-356/+341
There's nothing sysfs-specific in fs/sysfs/inode.c. Move everything in it to fs/kernfs/inode.c. The respective declarations in fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h. This is pure relocation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.hTejun Heo2-96/+121
Move data structure, constant and basic accessor declarations from fs/sysfs/sysfs.h to fs/kernfs/kernfs-internal.h. The two files currently include each other. Once kernfs / sysfs separation is complete, the cross inclusions will be removed. Inclusion protectors are added to fs/sysfs/sysfs.h to allow cross-inclusion. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put()Tejun Heo7-127/+118
Introduce kernfs interface for finding, getting and putting sysfs_dirents. * sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep assertion for sysfs_mutex is added. * sysfs_get_dirent_ns() is renamed to kernfs_find_and_get(). * Macro inline dancing around __sysfs_get/put() are removed and kernfs_get/put() are made proper functions implemented in fs/sysfs/dir.c. While the conversions are mostly equivalent, there's one difference - kernfs_get() doesn't return the input param as its return value. This change is intentional. While passing through the input increases writability in some areas, it is unnecessary and has been shown to cause confusion regarding how the last ref is handled. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotationTejun Heo3-44/+27
Currently, sysfs_dirent active_ref lockdep annotation uses attribute->[s]key as the lockdep key, which forces kernfs_create_file_ns() to assume that sysfs_dirent->priv is pointing to a struct attribute which may not be true for non-sysfs users. This patch restructures the lockdep annotation such that * kernfs_ops contains lockdep_key which is used by default for files created kernfs_create_file_ns(). * kernfs_create_file_ns_key() is introduced which takes an extra @key argument. The created file will use the specified key for active_ref lockdep annotation. If NULL is specified, lockdep for the file is disabled. * sysfs_add_file_mode_ns() is updated to use kernfs_create_file_ns_key() with the appropriate key from the attribute or NULL if ignore_lockdep is set. This makes the lockdep annotation properly contained in kernfs while allowing sysfs to cleanly keep its current behavior. This patch doesn't introduce any behavior differences. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: reorganize SYSFS_* constantsTejun Heo1-6/+6
We want to add one more SYSFS_FLAG_* but we can't use the next higher bit, 0x10000, as the flag field is 16bits wide. The flags are currently arranged weirdly - 8 bits are set aside for the type flags when there are only three three used, the first flag starts at 0x1000 instead of 0x0100 and flag literals have 5 digits (20 bits) when only 4 digits can be used. Rearrange them so that type bits are only the lowest four, flags start at 0x0010 and similar flags are grouped. This patch doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: introduce kernfs_notify()Tejun Heo1-11/+22
Introduce kernfs interface to wake up poll(2) which takes and returns sysfs_dirents. sysfs_notify_dirent() is renamed to kernfs_notify() and sysfs_notify() is updated so that it doesn't directly grab sysfs_mutex but acquires the target sysfs_dirents using sysfs_get_dirent(). sysfs_notify_dirent() is reimplemented as a dumb inline wrapper around kernfs_notify(). This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}()Tejun Heo1-11/+28
kernfs_ops currently only supports single_open() behavior which is pretty restrictive. Add optional callbacks ->seq_{start|next|stop}() which, when implemented, are invoked for seq_file traversal. This allows full seq_file functionality for kernfs users. This currently doesn't have any user and doesn't change any behavior. v2: Refreshed on top of the updated "sysfs, kernfs: prepare read path for kernfs". Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: remove sysfs_add_one()Tejun Heo4-41/+6
sysfs_add_one() is a wrapper around __sysfs_add_one() which prints out duplicate name warning if __sysfs_add_one() fails with -EEXIST. The previous kernfs conversions moved all dup warnings to sysfs interface functions and sysfs_add_one() doesn't have any user left. Remove sysfs_add_one() and update __sysfs_add_one() to take its name. This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: introduce kernfs_create_file[_ns]()Tejun Heo1-11/+42
Introduce kernfs interface to create a file which takes and returns sysfs_dirents. The actual file creation part is separated out from sysfs_add_file_mode_ns() into kernfs_create_file_ns(). The former now only decides the kernfs_ops to use and the file's size and invokes the latter. This patch doesn't introduce behavior changes. v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTRTejun Heo4-26/+14
After kernfs_ops and sysfs_dirent->s_attr.size addition, the distinction between SYSFS_KOBJ_BIN_ATTR and SYSFS_KOBJ_ATTR is only necessary while creating files to decide which kernfs_ops to use. Afterwards, they behave exactly the same. This patch removes SYSFS_KOBJ_BIN_ATTR along with sysfs_is_bin(). sysfs_add_file[_mode_ns]() are updated to take bool @is_bin instead of @type. This patch doesn't introduce any behavior changes. This completely isolates the distinction between the two sysfs file types in the sysfs layer proper. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: add sysfs_dirent->s_attr.sizeTejun Heo3-7/+8
sysfs sets the size of regular files unconditionally at PAGE_SIZE and takes the size of bin files from bin_attribute. The latter is a pretty bad interface which forces bin_attribute users to create a separate copy of bin_attribute for each instance of the file - e.g. pci resource files. Add sysfs_dirent->s_attr.size so that the size can be specified separately. This unifies inode init paths of ATTR and BIN_ATTR identical and allows for generic size handling for kernfs. Unfortunately, this grows the size of sysfs_dirent by sizeof(loff_t). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: introduce kernfs_opsTejun Heo2-34/+115
We're in the process of separating out core sysfs functionality into kernfs which will deal with sysfs_dirents directly. This patch introduces kernfs_ops which hosts methods kernfs users implement and updates fs/sysfs/file.c such that sysfs_kf_*() functions populate kernfs_ops and kernfs_file_*() functions call the matching entries from kernfs_ops. kernfs_ops contains the following groups of methods. * seq_show() - for kernfs files which use seq_file for reads. * read() - for direct read implementations. Used iff seq_show() is not implemented. * write() - for writes. * mmap() - for mmaps. Notes: * sysfs_elem_attr->ops is added so that kernfs_ops can be accessed from sysfs_dirent. kernfs_ops() helper is added to verify locking and access the field. * SYSFS_FLAG_HAS_(SEQ_SHOW|MMAP) added. sd->s_attr->ops is accessible only while holding active_ref and there are cases where we want to take different actions depending on which ops are implemented. These two flags cache whether the two ops are implemented for those. * kernfs_file_*() no longer test sysfs type but chooses different behaviors depending on which methods in kernfs_ops are implemented. The conversions are trivial except for the open path. As kernfs_file_open() now decides whether to allow read/write accesses depending on the kernfs_ops implemented, the presence of methods in kobjs and attribute_bin should be propagated to kernfs_ops. sysfs_add_file_mode_ns() is updated so that it propagates presence / absence of the callbacks through _empty, _ro, _wo, _rw kernfs_ops. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.hTejun Heo1-11/+0
sysfs_open_file will be used as the primary handle for kernfs methods. Move its definition from fs/sysfs/file.c to include/linux/kernfs.h and mark the public and private fields. This is pure relocation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: prepare open, release, poll paths for kernfsTejun Heo3-21/+10
We're in the process of separating out core sysfs functionality into kernfs which will deal with sysfs_dirents directly. This patch prepares the rest - open, release and poll. There isn't much to do. Just renaming is enough. As sysfs_file_operations and sysfs_bin_operations are identical now, use the same file_operations for both - kernfs_file_operations. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: prepare mmap path for kernfsTejun Heo1-30/+39
We're in the process of separating out core sysfs functionality into kernfs which will deal with sysfs_dirents directly. This patch rearranges mmap path so that the kernfs and sysfs parts are separate. sysfs_kf_bin_mmap() which handles the interaction with bin_attribute mmap method is factored out of sysfs_bin_mmap(), which is renamed to kernfs_file_mmap(). All vma ops are renamed accordingly. sysfs_bin_mmap() is updated such that it can be used for both file types. This will eventually allow using the same file_operations for both file types, which is necessary to separate out kernfs. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-30sysfs, kernfs: prepare write path for kernfsTejun Heo1-53/+50
We're in the process of separating out core sysfs functionality into kernfs which will deal with sysfs_dirents directly. This patch rearranges write path so that the kernfs and sysfs parts are separate. kernfs_file_write() handles all boilerplate work including buffer management and locking and invokes sysfs_kf_write() or sysfs_kf_bin_write() depending on the file type which deals with the interaction with kobj store or bin_attribute write method. While this patch changes the order of some operations, it shouldn't change any visible behavior. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>